ABSTRACT

We present a new measure of the angular two-point correlation function of
Lyman-break galaxies (LBGs) at z ~ 3, obtained from the variance of galaxy
counts in 2-dimensional cells. By avoiding binning of the angular separations,
this method is significantly less affected by shot noise than traditional measures,
and allows for a more accurate determination of the correlation function. We used
a sample of about 1,000 galaxies with ℛ < 25.5 extracted from the survey by
Steidel and collaborators, and found the following results. At scales in the range
30 ≲ θ ≲ 100 arcsec, the angular correlation function w(θ) can be accurately
described as a power law with slope β = 0.50^{+0.25}_{−0.50} (1σ random) −0.10 (systematic),
shallower than the measure presented by Giavalisco et al. However, the spatial
correlation length, derived by Limber deprojection, is in very good agreement
with the previous measures, confirming the strong spatial clustering of these
sources. We discuss in detail the effects of both random and systematic errors,
in particular of the so-called "integral constraint" bias, to which we set a lower
limit using numerical simulations. This suggests that the current samples do
not yet provide a "fair representation" of the large-scale distribution of LBGs at
z ~ 3. An intriguing result of our analysis is that at angular separations smaller
than θ ~ 30 arcsec the correlation function seems to depart from the power law
fitted at larger scales and become smaller. This feature is detected at the ~ 90%
confidence level and, if real, it can provide information on the number density
and spatial distribution of LBGs within their host halos as well as the size and
the mass of the halos.

Subject headings: cosmology: observations - cosmology: theory - galaxies: clusters:
general - galaxies: evolution

1. INTRODUCTION



A key paradigm of cosmology is that the formation of galaxies, and of the large-scale 
structure in their spatial distribution, occurred as a result of the action of gravity, which 
amplified small fluctuations in the primordial mass-density field. Bound structures such as 
clusters, groups and galaxies formed by gravitational collapse of density perturbations, and 
star formation took place wherever the local physical conditions permitted the baryonic
gas to condense and cool (White & Rees 1978). Testing this scenario and reconstructing
its timing sequence are two major goals of the observations. 

A number of studies in the local and moderately distant universe have attempted to use 
the clustering properties and peculiar velocities of galaxies to test the gravity paradigm (see, 
for example, Peacock et al. 2001 and references therein). The difficulty with this approach
is that, in general, galaxies do not necessarily trace the mass-density field, and their "bias" 
as generic statistical tracers of the mass distribution is not directly measurable from the 
data. This bias is also very difficult to model, since it strongly depends on the physics of 
star formation in galaxies, which is currently poorly understood. While this limitation is 
probably not severe in the local universe, because the average bias of the present galaxy 
populations is small (e.g. Taylor et al. 2000; Padmanabhan, Tegmark & Hamilton 2001; 
Scoccimarro et al. 2001; Feldman et al. 2001), this is not the case at high redshifts, where 
the bias is very likely significantly larger. Furthermore, peculiar velocities of high redshift 
galaxies are extremely difficult to measure. 

A different approach is to directly compare the observed properties of the galaxy distri- 
bution to the predictions of the theory, in which the effects of the bias are explicitly taken 
into account either by means of specific recipes of star formation coupled with N-body sim- 
ulations (e.g. Kauffmann et al. 1999; Benson et al. 2000; Wechsler et al. 2001) or with 
hydrodynamic simulations (Katz, Hernquist & Weinberg 1999; Lewis et al. 2000). In this 
way, one actually assumes gravity as the main interaction responsible for galaxy and struc- 
ture formation and tests if this, in combination with the adopted physics of star formation, 
can satisfactorily reproduce the data. This technique is becoming increasingly popular at 
high redshifts (e.g. z > 2), where relatively large and well controlled samples of galaxies are 
becoming available. 

Recently, refinements in color selection criteria have made possible empirical studies of 
galaxy clustering in the high-redshift universe. Color selection, such as the Lyman-break 
technique (Steidel et al. 1996, 1998; Madau et al. 1996; Lowenthal et al. 1997) or the 
photometric redshift one (Budavari et al. 2000; Fernandez-Soto et al. 2001), allows one to 
efficiently identify classes of galaxies in a preassigned redshift range based on their spectral 
energy distribution. This has resulted in the compilation of large and well-controlled samples 
of galaxies at z > 2 which are suitable for clustering studies (Giavalisco et al. 1998, G98
hereafter; Adelberger et al. 1998, A98 hereafter; Connolly et al. 1999; Arnouts et al. 1999; 
Giavalisco and Dickinson 2001). Since "Lyman-break galaxies" (LBGs hereafter) essentially 
consist of actively star-forming galaxies (in comparison, quiescent galaxies at high redshifts 
are much less efficiently identified with the current instrumentation), a major goal of these 
studies is to provide empirical information on the physics of star formation and on the 
effects of the light-to-mass bias in determining the final observed clustering properties of 
the galaxies. 

The most widely used statistic to quantify galaxy clustering is the two-point correlation
function, in both its angular, w(θ), and spatial, ξ(r), versions. These functions measure
the excess probability (with respect to a Poisson point-process) of finding galaxy pairs with 
a given angular or spatial separation (e.g. Peebles 1980). At high redshifts, where the 
samples are still too small for more sophisticated studies, the two-point functions are often 
the only statistics that can be practically used. The important result that came from the 
high-redshift samples is that at z ~ 3 star-forming galaxies are apparently rather strongly
clustered, with a (comoving) two-point correlation length that rivals that of the galaxies in 
the local universe (Steidel et al. 1998; G98; A98; Connolly et al. 1998; Arnouts et al. 1999; 
Giavalisco & Dickinson 2001). This is interesting, because if the distant galaxies trace the 
mass in the same way as the local galaxies do, a spatial clustering at z ~ 3 almost as strong
as at z = 0 is very difficult to explain simply in terms of gravitational instability. This is
true for any reasonable choice of the background cosmology in which the power spectrum of 
linear density fluctuations is normalized to reproduce the present-day abundance of massive 
clusters (Eke, Cole & Frenk 1996; Jenkins et al. 1998). If gravity is the interaction responsible 
for structure formation in the universe, the strong clustering at high redshifts implies that 
the galaxies at z ~ 3 were more biased tracers of the mass distribution than their current
counterparts are today, and, in particular, that they resided in regions of the mass-density 
field that were spatially more clustered than the mass itself on average. 

A simple explanation for the strong bias is provided by the theory of biased galaxy 
formation (White & Rees 1978; Kaiser 1984; Peacock & Heavens 1985; Bardeen et al. 1986) 
which postulates that galaxies form within virialized dark matter halos. A general
prediction is that the clustering amplitude of the most massive halos at any given epoch is
amplified with respect to that of the mass distribution, while very small halos are nearly 
good tracers of the mass-density field (e.g. Mo & White 1996; Catelan et al. 1997; Por- 
ciani et al. 1998). The mechanism explains the observations if the observed galaxies form 
within relatively massive halos (G98; A98), and also suggests a method to constrain the 
mass spectrum of the observed galaxies under the assumption of a cosmological model and a 
spectrum of linear density fluctuations (Giavalisco & Dickinson 2001). In general, the strong
clustering of high-redshift galaxies has been regarded as an indication of the overall robustness
of the theory and as evidence of the reality of galaxy biasing (cf. Pierce et al. 1999; Baugh
et al. 1999; Governato et al. 1998).

While the detection of fairly strong clustering at high redshifts seems to be robust,
the current samples of LBGs still contain too few objects and cover too small an
area of the sky (the largest survey so far covers only ~ 0.3 square degrees; cf. Steidel et al.
1999) to accurately measure the correlation function or other clustering statistics. Not only
is the signal-to-noise ratio of the current measures of order ~ 3 at best, but the dispersion
of different measures also suggests the possibility of systematic errors. Thus, an important goal
of the observations is to assess how reliably the clustering properties of LBGs have been
determined, and to accurately measure the shape of the correlation function. For example,
the correlation length measured by Steidel et al. (1998) and A98 is somewhat larger than 
the one reported by G98. Giavalisco & Dickinson (2001) find evidence that the correlation 
length might depend on the UV luminosity of the galaxies, possibly decreasing by as much 
as a factor of ~ 3 when going from ground-based samples with flux limit ℛ ~ 25.5 to the
HDF with ℛ ~ 27. However, Arnouts et al. (1999) find that the correlation function of
galaxies at z ~ 3 in the HDF identified with photometric redshifts is only marginally smaller
than that of the LBGs of the ground-based sample. 

In this paper we present a new, more accurate measure of the two-point angular correlation
function, w(θ), of Lyman-break galaxies at z ~ 3 with ℛ < 25.5, and a study of the
systematic and random errors involved in the measure. Currently, redshift surveys of LBGs 
do not yet allow direct, robust measures of their (small-scale) three-dimensional clustering. 
A valid alternative is to use the available redshift information in the form of a distribution 
function and deproject the angular (two-dimensional) clustering of the galaxies to derive the 
spatial one (Peebles 1980). Although projection effects decrease the signal-to-noise ratio in 
the clustering signal, angular samples generally contain many more galaxies than the redshift 
ones, and are easier to compile because they are less subject to the systematics induced by the
sampling technique. Moreover, they are insensitive to the effects of redshift-space distortions
induced by peculiar velocities. In the case of the LBGs at z ~ 3 the method is particularly 
advantageous, because the redshift distribution is accurately known (e.g. Steidel et al. 1999) 
and covers a relatively small interval of redshift (see the discussion in G98). 

The new measure is based on the method of the counts-in-cells (CIC) applied to the 
angular positions of the galaxies. We use a sample of 971 LBGs photometrically selected from 
a ground-based survey in 8 distinct fields. This sample contains ~ 100 galaxies more than the
compilation of catalogs used by G98; however, the primary motivation to revisit the measure
of w(θ) with the CIC is that this method is significantly less affected by shot noise (and hence
more accurate) compared to the estimators used by G98. The CIC technique automatically
combines information coming from different scales (in practice, all the separations smaller 
than a threshold value), as opposed to the traditional estimators, where the distribution 
of angular separations is binned into relatively small intervals. Since discreteness errors
represent the major source of uncertainty at small angular separations, the technique allows
one to measure the clustering on small scales with improved accuracy. On the other hand,
using the CIC technique one does not really measure the angular correlation function w(θ),
but rather its average over the distribution of separations between all the pairs of points
which lie within a cell, w̄(Θ), as a function of the cell size Θ (see equations (6) and (B-8) for
details). One also has to deal with highly correlated data points when fitting models to the
data. These are relatively minor penalties, however, particularly if the correlation function
has a simple shape, for example a power law, as turns out to be the case for LBGs, because
then it is straightforward to derive the parameters of w(θ) from w̄(Θ) (these functions are
directly proportional to each other). 
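
The proportionality between w̄(Θ) and w(Θ) for a power law can be illustrated with a short Monte Carlo sketch. It is not the procedure used in this paper: it assumes circular cells in the flat-sky approximation, a hypothetical amplitude of 1, and an illustrative slope β = 0.5, and all function names are introduced here for the example only.

```python
import math
import random


def random_point_in_disk(radius, rng):
    """Draw a point uniformly inside a disk of the given radius."""
    r = radius * math.sqrt(rng.random())
    phi = 2.0 * math.pi * rng.random()
    return r * math.cos(phi), r * math.sin(phi)


def cell_averaged_power_law(beta, radius, n_pairs, rng, amplitude=1.0):
    """Monte Carlo average of w(theta) = amplitude * theta**(-beta) over
    pairs of points inside a circular cell (flat-sky approximation)."""
    total = 0.0
    used = 0
    for _ in range(n_pairs):
        x1, y1 = random_point_in_disk(radius, rng)
        x2, y2 = random_point_in_disk(radius, rng)
        sep = math.hypot(x2 - x1, y2 - y1)
        if sep == 0.0:  # skip the singular zero-separation case
            continue
        total += amplitude * sep ** (-beta)
        used += 1
    return total / used


# Ratio wbar(Theta) / w(Theta) at two cell radii (arcsec), using the same
# random sequence so that only the cell size differs between the two runs.
ratio_small = cell_averaged_power_law(0.5, 10.0, 20000, random.Random(1)) / 10.0 ** -0.5
ratio_large = cell_averaged_power_law(0.5, 100.0, 20000, random.Random(1)) / 100.0 ** -0.5
```

Because the same pseudo-random sequence is used for both radii, the two ratios agree to rounding error, which is simply the statement that w̄(Θ) inherits the slope of w(θ) and differs from it only by a constant that depends on β and on the cell shape.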

The outline of the paper is as follows. In Section 2 we describe our sample of LBGs. In 
Section 3 we discuss how we estimated the galaxy correlation function and its uncertainty. In 
Section 4 we describe our technique for fitting a power-law model to the observed correlation 
function. In Section 5 we discuss a series of tests to assess the statistical significance of 
an apparent break in the scale-invariant properties of LBGs at small angular separations. 
Results from the CIC analysis are cross-checked with a different measure of the correlation 
function obtained counting galaxy pairs in Section 7. Finally, in Section 8, we summarize 
our results and discuss their implications for the properties of the hosting halos of LBGs and 
for models of galaxy formation. 



2. THE DATA SET 

The high efficiency of the Lyman-break technique at z ~ 3 (Steidel, Pettini & Hamilton
1995; Madau et al. 1996; Lowenthal et al. 1997; Dickinson 1998), and the relative ease 
with which color selection criteria can be quantified and modeled (Steidel et al. 1999), make 
it particularly advantageous for constructing large and well controlled samples of galaxies 
that are nicely suited to study galaxy clustering at high redshift (G98; A98; Giavalisco & 
Dickinson 2001). 

Most of the data used in this work have been presented and discussed by G98, and 
the sample of LBGs considered here largely overlaps with the "PHOT" sample discussed 
by Giavalisco & Dickinson (2001). The only difference consists in the addition of one more 
field, dubbed CDFb in Table 1, which was acquired shortly after G98 was written. This field 
has been obtained with the same instrumental configuration and in similar conditions as the
other ones (except the Westphal field; see the discussion in G98), and it reaches the same
flux limit. The whole sample thus consists of 8 fields imaged in the custom photometric
system U_n G ℛ, which has been designed to enhance the sensitivity to relatively unreddened
star-forming galaxies at z ~ 3 through the Lyman-break technique (Steidel et al. 1999). As
in the previous studies of the clustering properties of LBGs (e.g. see G98; A98; Giavalisco & 
Dickinson 2001), we have considered as LBG candidates all the objects whose colors satisfy 
the following relations 

    (U_n − G) > 1.0 + (G − ℛ) ;    (U_n − G) > 1.6 ;    (G − ℛ) < 1.2 ,    (1)

with the additional requirement ℛ < 25.5 imposed to produce a reasonably complete sample.
It is important to keep in mind that color criteria designed to select LBGs are, to some extent, 
arbitrary (see, e.g., Dickinson 1998; Steidel et al. 1999). The definition above is a relatively 
stringent color cut, and other criteria can certainly be defined that would result in larger 
samples of high-redshift galaxies. However, as a result of the combined effects of intrinsic 
scatter in the galaxies' spectral energy distribution and photometric errors, these would also 
contain a non-negligible number of interlopers at lower redshifts. Since one of the goals of
this paper is to measure the spatial correlation length of LBGs by deprojecting the angular 
correlation function, interlopers represent a source of systematic errors that would bias our 
estimates, and need to be minimized. With the spectroscopic information available, we have 
defined the color cuts in equation (1) in order to obtain an optimal balance between the 
competing requirements of having as large a sample as possible which is also as free of
low-redshift interlopers as possible, i.e. maximizing the efficiency (see the discussion in Steidel et
al. 1999). In this case, the only significant source of interlopers is Galactic stars (3.4 ± 0.8%),
and all the galaxies that satisfy equation (1) and that have been confirmed spectroscopically
were found to have redshifts in the expected range for U_n-band dropouts, i.e. 2.2 ≲ z ≲ 3.8.
LBG candidates that remain unidentified have spectra with too low a signal-to-noise ratio
to allow a secure measure of the redshifts, and in no case was an identified redshift found
outside the range expected for LBGs selected using equation (1). It is useful to remember,
in any case, that conclusions on the clustering properties of the galaxies selected through
equation (1) do not necessarily apply to galaxies at similar redshift that are undetected by
these color criteria. 
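
As an illustration of how the selection of equation (1) acts on a photometric catalogue, the following minimal sketch applies the three color cuts and the magnitude limit to a single object. The `Photometry` container and the function name are hypothetical, and the treatment of the inequalities as strict simply follows the form in which they are written above.

```python
from dataclasses import dataclass


@dataclass
class Photometry:
    """Hypothetical container for the three magnitudes of one object."""
    un: float  # U_n magnitude
    g: float   # G magnitude
    r: float   # R magnitude


def is_lbg_candidate(p: Photometry, r_limit: float = 25.5) -> bool:
    """Apply the color cuts of equation (1) plus the R-band magnitude limit."""
    un_g = p.un - p.g
    g_r = p.g - p.r
    return (
        un_g > 1.0 + g_r   # diagonal cut in the two-color plane
        and un_g > 1.6     # minimum U_n - G break
        and g_r < 1.2      # reject objects that are too red in G - R
        and p.r < r_limit  # completeness limit
    )
```

An object with (U_n, G, ℛ) = (27.0, 25.0, 24.5) passes all four conditions, while making the same object fainter than ℛ = 25.5 or reducing its U_n − G color to 1.0 removes it from the sample.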

Table 1 details the essential statistics of the individual fields that make up the sample,
which includes 971 galaxies that satisfy equation (1), after removing all the objects
spectroscopically classified as stars (see Steidel et al. 1999). The average surface density of LBGs
with ℛ < 25.5 and its standard deviation are 1.25 ± 0.22 arcmin⁻², while the mean density
over the whole sample is 1.24 arcmin⁻². Only 469 of these galaxies have spectroscopic
redshifts, and we expect some stellar contamination among the ~ 52% of the galaxies in the sample
without redshift identification, which we have estimated to be ~ 1.4%.

For the purpose of measuring the clustering properties, we recall that in the presence of
an unclustered population of spurious objects (with negligible cross-correlation with the
LBGs) that contaminate the sample by the fractional amount f, the estimated and "true"
correlations differ by a factor 1/(1 − f)² (e.g. Maddox, Efstathiou & Sutherland 1996).
This bias affects the amplitude of the two-point correlation function, but does not change
its shape. Thus, stellar contamination will systematically bias low our estimates by ~ 3%.
Note that from the spectroscopic yield of the survey (~ 80%, Steidel et al. 1999) a strict
upper limit to the contamination factor, 1/(1 − f)² ~ 1.5, is obtained if one assumes that all the
missed spectroscopic identifications of LBG candidates are interlopers. 
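
The size of this correction is easy to evaluate numerically. The sketch below (the function name is introduced here for illustration) computes the factor 1/(1 − f)² by which the measured amplitude must be multiplied to recover the amplitude of the uncontaminated population, under the stated assumption that the contaminants are unclustered and uncorrelated with the LBGs.

```python
def contamination_factor(f: float) -> float:
    """Factor 1/(1 - f)**2 relating the true and the estimated correlation
    amplitude when a fraction f of the sample is unclustered contaminants."""
    if not 0.0 <= f < 1.0:
        raise ValueError("contamination fraction must satisfy 0 <= f < 1")
    return 1.0 / (1.0 - f) ** 2


stellar = contamination_factor(0.014)  # estimated residual stellar fraction
worst_case = contamination_factor(0.2)  # all unidentified spectra are interlopers
```

With f = 0.014 the factor is about 1.03, i.e. the ~ 3% bias quoted above, and with f = 0.2 it is about 1.56, consistent with the upper limit of ~ 1.5.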

The redshift distribution N(z) of the galaxies selected by means of equation (1) has
been very well measured by an extensive spectroscopic program (see Steidel et al. 1999 for
a discussion of the observations). We refer to Figure 1 of Giavalisco & Dickinson (2001) for
a plot of the function N(z) of the sample discussed here. We will use this function when
computing the Limber deprojection of the angular correlation function. 



3. MEASURING THE ANGULAR CORRELATION FUNCTION FROM COUNTS-IN-CELLS

We shall now describe how we have measured the correlation function of LBGs from 
the count statistics. A proper treatment of the shot noise is crucial to reliably extract 
the information on w(θ) from low signal-to-noise data like ours. For this reason we have
followed two different methods to reconstruct w(θ) from the counts that adopt very different
strategies to account for the shot noise. In one, discussed in §3.1, the contribution of the 
shot noise is estimated and subtracted. In the other, presented in §3.4, the measure is based 
on maximum likelihood criteria, adopting a model for the galaxy CPDF. As we shall see, 
the two methods give consistent results. 



3.1. Counts-in-Cells and Factorial Moments

The clustering properties of a distribution of points are completely characterized by the
count probability distribution function (CPDF), P_N, defined as the probability of finding N
objects in a randomly placed cell of fixed shape and size. In principle, the CPDF can be
estimated from galaxy catalogs (e.g. Szapudi & Colombi 1996) by placing a number C ≫ 1
of identical cells of given size and shape at random in the sample's volume, and counting the
fraction of cells that contain N objects, i.e.

    \tilde{P}_N = \frac{1}{C} \sum_{i=1}^{C} \delta_{N_i N} ,    (2)

where P̃_N is the estimator of the CPDF, N_i is the number of galaxies found in the i-th cell,
and δ_ij is the Kronecker delta symbol.¹



In practice, even though the CPDF contains all the information about the clustering
process, it is often more convenient to look at simpler and more directly interpretable
statistics. Commonly used in clustering analysis are the central moments and cumulants of P_N,
which are directly related to the galaxy N-point correlation functions (see Appendix A).
The factorial moments of order k of the CPDF are defined as

    F_k = \langle (N)_k \rangle = \sum_N P_N \, (N)_k ,    (3)

where (N)_k = N(N − 1) ... (N − k + 1) is the k-th falling factorial of N (e.g. Kendall, Stuart
& Ord 1987). Factorial and standard moments are related through the Stirling numbers of
the second kind, S(m, k), as follows: ⟨N^m⟩ = Σ_{k=0}^{m} S(m, k) F_k (e.g. Szapudi & Szalay 1993).
If the point process under analysis (in our case, the galaxy distribution) is obtained by
Poisson sampling of an underlying continuous field, the k-th factorial moment of the CPDF
is equal to the k-th standard moment of the continuous field. Then, estimating the factorial
moments in a galaxy sample is equivalent to computing the standard moments of the CPDF
and subtracting the shot-noise contribution (see Appendix A).
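
The definitions above translate directly into a few lines of code. The following sketch, written for illustration only, estimates the CPDF from a list of cell counts, evaluates the factorial moments of equation (3), and provides the Stirling numbers needed to check the relation between factorial and standard moments. The counts used in the example are hypothetical and the function names are not taken from any existing package.

```python
def falling_factorial(n, k):
    """(N)_k = N (N-1) ... (N-k+1); equals 1 for k = 0."""
    result = 1
    for j in range(k):
        result *= n - j
    return result


def cpdf_from_counts(counts):
    """Estimate P_N as the fraction of cells that contain N objects."""
    n_cells = len(counts)
    pdf = {}
    for n in counts:
        pdf[n] = pdf.get(n, 0.0) + 1.0 / n_cells
    return pdf


def factorial_moment(pdf, k):
    """F_k = sum over N of P_N (N)_k, as in equation (3)."""
    return sum(p * falling_factorial(n, k) for n, p in pdf.items())


def standard_moment(pdf, m):
    """<N^m> = sum over N of P_N N^m."""
    return sum(p * n ** m for n, p in pdf.items())


def stirling2(m, k):
    """Stirling number of the second kind from the standard recurrence."""
    if m == k:
        return 1
    if k == 0 or k > m:
        return 0
    return k * stirling2(m - 1, k) + stirling2(m - 1, k - 1)


counts = [0, 1, 1, 2, 3, 0, 2, 1]  # hypothetical counts in 8 cells
pdf = cpdf_from_counts(counts)
f = [factorial_moment(pdf, k) for k in range(4)]
```

For these counts, summing `stirling2(m, k) * f[k]` over k reproduces `standard_moment(pdf, m)` for each m up to 3, which is the relation quoted in the text.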



An unbiased and consistent estimator of F_k is

    \tilde{F}_k = \frac{1}{C} \sum_{i=1}^{C} \frac{(N_i)_k}{p_i^k} ,    (4)

(Szapudi & Szalay 1996), where p_i is the probability that a galaxy in the cell i has been
included in the catalogue (i.e. the product of the selection function and the sampling rate). In
the following we will assume that p_i has the same (unknown) value for all the cells contained
in a given catalogue (i.e. we neglect any small-scale fluctuations due to extinction and
inhomogeneities in the CCD frames). We further assume (except where explicitly stated)
that fluctuations between the selection functions of different catalogues are negligible (for a
detailed discussion about plate to plate variations, see G98).

¹The formula is only valid for complete galaxy sampling. For the most general case, see equations (3) and
(4).

Szapudi & Colombi (1996) and Szapudi, Colombi & Bernardeau (1999) discussed in
detail the problem of estimating the errors of the factorial moments of P_N. They showed
that the proper error generating function (and thus the errors themselves) for a galaxy
catalogue which covers a solid angle S, and is sampled with C ≫ 1 cells, E^{C,S}, is given by

    E^{C,S} = E^{\infty,S} + E^{C,\infty} ,    (5)

where E^{∞,S} is the error associated with the finiteness of the catalogue (hereafter denoted
as "cosmic error"), and E^{C,∞} is the measurement error due to the finite number of cells used
for computing P̃_N. Only E^{∞,S} quantifies intrinsic properties of the galaxy catalog under
analysis, and it represents the minimum uncertainty that can be obtained by extracting the 
whole information on galaxy clustering from the data. Note that systematic errors introduced 
during the observation and data reduction are not taken into account by equation (5). To 
accurately sample the tails of the CPDF it is important to consider the largest possible 
number of cells. It can be shown that, for large C, E^{C,∞} ∝ C^{-1} and the corresponding constant
of proportionality increases for smaller cells (Szapudi & Colombi 1996). The measurement
error can then be made negligible with respect to the cosmic variance by considering a sufficiently
large number of cells.² The ideal case of infinite sampling can actually be achieved in practice
using an algorithm developed by Szapudi (1998). Unfortunately, because of the complex 
geometrical structure of our samples, we could not adopt this method in our analysis. 



3.1.1. Estimating the correlation function of LBGs 

The connected moments of the galaxy distribution, i.e. the n-point correlation
functions, can be estimated using non-linear combinations of the factorial moments (Szapudi &
Szalay 1993, see also Appendix A). For example, the average of the two-point correlation
function over a cell of linear size Θ, which is defined as

    \bar{w}(\Theta) = \frac{1}{\Sigma^2} \int_{\Sigma} d\Omega_1 \int_{\Sigma} d\Omega_2 \, w(|\theta_2 - \theta_1|) ,    (6)

(where the angles θ_i and φ_i specify a line of sight in the sky, dΩ = sin(θ) dθ dφ is the differential
solid angle, Σ = Σ(Θ) is the solid angle subtended by the cell, and w(θ) is the two-point
angular correlation function expressed in terms of the angular separation θ = |θ_2 − θ_1|) can
be estimated using

    \tilde{\bar{w}} = \frac{\tilde{F}_2}{\tilde{F}_1^2} - 1 .    (7)

²Often, in the literature, the global number of cells is selected by requiring the product between C and
the cell surface to equal the area covered by the catalogue. In general, using such a small number of cells
produces non-negligible measurement errors and, for high-precision determinations, massive oversampling
with respect to this method is always required (see e.g. Figure 1 in Szapudi, Meiksin & Nichol 1996).

In practice, we estimated P̃_N, F̃_1, F̃_2 and w̄ for our samples in circular cells with 47 (linearly
equispaced) radii in the range 8 ≤ Θ ≤ 100 arcsec. As we shall discuss below in more detail,
the upper limit for the cell size was chosen to minimize edge effects, while the lower bound
was set because the shot noise increases rapidly with decreasing Θ. For each cell size, we
have determined the number of cells necessary to estimate the CPDF and its moments as
follows. We initially set C = 10^ and estimated w̄ several times using different seeds for
the pseudo-random number generator that determines the actual positions of the cells. If the
variance between the estimates was found to be larger than 1% of the corresponding average
value, we increased C by a factor of ten and repeated the procedure. We found the following
optimal values: C = 10® for 11 < Θ < 25 arcsec, C = 10^ for 25 < Θ < 55 arcsec, and
C = 10^ for Θ > 55 arcsec. All the sets of counts-in-cells used to determine the variance
have then been combined together to estimate the CPDF. In order to minimize CPU-time,
we used C = 10® also for Θ = 8 and 10 arcsec even though this corresponds to ~ 3%
measurement error. As we shall see in the next section, the uncertainty in the estimate of w̄
due to cosmic variance decreases from ~ 200% to ~ 30% when the cell radius is increased
from 8 to 100 arcsec. Thus, a random error of ~ 1-3% is, for practical purposes, negligible
compared to the cosmic error. Note that using C = 10^ at Θ = 8 arcsec would have caused
a measurement error of order 40%.
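
The measurement itself can be sketched in a few lines. The example below is a simplified illustration rather than the code used for this analysis: it assumes a single square field in the flat-sky approximation, complete sampling (p_i = 1), and a hypothetical unclustered mock catalogue; cells are thrown only where they lie entirely inside the field, and w̄ is obtained from the first two factorial moments as in equation (7).

```python
import random


def counts_in_cells(points, radius, n_cells, field_side, rng):
    """Count points in n_cells circular cells placed at random so that each
    cell lies entirely inside a square field of side field_side."""
    r2 = radius * radius
    counts = []
    for _ in range(n_cells):
        cx = rng.uniform(radius, field_side - radius)
        cy = rng.uniform(radius, field_side - radius)
        counts.append(sum(1 for x, y in points
                          if (x - cx) ** 2 + (y - cy) ** 2 <= r2))
    return counts


def wbar_from_counts(counts):
    """Cell-averaged correlation, wbar = F_2 / F_1**2 - 1 (equation 7)."""
    n_cells = len(counts)
    f1 = sum(counts) / n_cells
    f2 = sum(n * (n - 1) for n in counts) / n_cells
    if f1 == 0.0:
        raise ValueError("no objects found in any cell")
    return f2 / (f1 * f1) - 1.0


# Hypothetical unclustered mock: 500 points in a 540 x 540 arcsec field.
rng = random.Random(42)
mock = [(rng.uniform(0.0, 540.0), rng.uniform(0.0, 540.0)) for _ in range(500)]
wbar_mock = wbar_from_counts(counts_in_cells(mock, 50.0, 500, 540.0, rng))
```

For an unclustered mock the estimate scatters around zero. Repeating the call with different seeds and comparing the scatter of the results to their mean is the convergence test described above for choosing C.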

Figure 1 shows the average correlation function measured with the CIC technique
described above together with the error bars, computed as described in the following section.
Note that w̄ decreases monotonically for Θ > 30 arcsec, it is approximately constant
between 20 and 30 arcsec, and it decreases again for Θ ≲ 20 arcsec. As we will discuss in Section
4, for Θ ≳ 40 arcsec w̄(Θ) is very well modeled by a power law with slope ~ −0.5. At smaller
scales this model does not adequately describe the data. It is important to determine if this
apparent break in the scale-invariant clustering properties of the LBGs is real, derives from
systematic errors in our analyses, e.g. from incorrect shot-noise subtraction, or is simply due
to statistical fluctuations. This will be the subject of §5.

3.2. Error estimation

To leading order in Σ/S (Σ and S are, respectively, the solid angles subtended by a cell
and the survey) and assuming Poisson sampling (hereafter PS, see Appendix A), the cosmic 
error in the estimate of the factorial moments (and, thus, of the correlation functions of any 
order) can be broken into three components (see, e.g., Szapudi & Colombi 1996): 

• the discreteness error. Sampling a continuous field with a finite number of points 
always causes loss of information. In particular, recovering the statistical properties 
of the underlying field from the analysis of the corresponding point process becomes 
increasingly uncertain when the average density of sampling points is decreased. For 
counts-in-cells studies, this error increases towards small scales and for higher
order statistics;

• the edge effect error. To minimize boundary effects, only cells which are completely 
included in the galaxy sample must be considered. As a result, the central part of 
each field is oversampled with respect to the regions lying near the boundaries. This 
introduces a bias in P̃_N whose magnitude increases with the cell size. In our case,
for Θ = 100 arcsec (circular cells) the effective surface of the LBG samples (i.e. the
area containing the centers of the cells) covers only ~ 45% of the whole sample.
Moreover, the fraction of cells included in each individual field is proportional to its
effective surface. Thus, as a result, the weight of the Westphal sample in the estimate
of P̃_N increases with the cell size. For instance, considering circular cells with radii in
the range from 8 to 100 arcsec, the fraction of cells lying in the Westphal frame moves 
from 0.30 to 0.39; 

• the finiteness error. Estimating the clustering properties of a point-process from a 
finite sample is, in general, affected by random statistical uncertainties, which arise 
from the lack of information on the fluctuations of the density-field on scales larger 
than the sample size. This finite-volume error is proportional to the average of the 
two-point correlation function over the whole sample. The ensemble average of the 
finiteness error is known as the integral constraint bias (e.g. Peebles 1980). 
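
The loss of effective area behind the edge effect can be quantified with a one-line geometric estimate. The sketch below assumes an idealized square field with no masked regions, which is a simplification of the real field geometry; the function name is introduced here for illustration.

```python
def effective_area_fraction(field_side, cell_radius):
    """Fraction of a square field in which the center of a circular cell
    can be placed while the cell remains entirely inside the field."""
    inner = field_side - 2.0 * cell_radius
    if inner <= 0.0:
        return 0.0
    return (inner / field_side) ** 2


# One 9 x 9 arcmin field (540 arcsec on a side), smallest and largest cells.
frac_small = effective_area_fraction(540.0, 8.0)
frac_large = effective_area_fraction(540.0, 100.0)
```

For a single 9 × 9 arcmin field the fraction drops from about 0.94 at Θ = 8 arcsec to about 0.40 at Θ = 100 arcsec. A larger field loses proportionally less area, which is why the relative weight of the Westphal field grows with the cell size.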

Szapudi & Colombi (1996) computed the leading order contributions to the errors for 
F_1 and F_2 assuming PS and hierarchical scaling of the moments. Their results depend on
the field geometry, galaxy surface density and the correlation functions of third and fourth 
order. Unfortunately, in the case of the LBGs, no information on the n-point functions 
with n > 2 is available (a discussion on the higher order cumulants of the CPDF will
be presented in a future work). Furthermore, our sample consists of a number of
sub-samples with different sizes. For these reasons, we did not derive the errors analytically,
and we estimated the variance and bias of our statistics with a non-parametric method, 
namely using the bootstrap resampling technique (Efron 1979). In particular, to account for 
galaxy correlations, we adopted a variant of the (unmatched) blockwise-bootstrap method 
developed for the analysis of time-series (Kiinsch 1989). First, we divided the whole sample 
into 10 sub-samples covering nearly the same area on the sky. In particular, each of the seven 
9x9 arcmin^ fields of Table 1 has been considered as a sub-sample, while the Westphal 
field has been separately divided into 3 additional sub-samples. We then built B = 100 
"bootstrap samples" each of them composed by 10 sub-samples randomly drawn from the 
set described above. The sub-samples were not matched together to build larger fields, since 
this procedure would have changed the galaxy clustering properties. Estimates of the CPDF, 
-P^ , the first two factorial moments, F^ and F2 , and for the average correlation function, 
w , have been derived for each artificial sample with the same procedure adopted for the 
real data, and we have estimated the variance of the statistics S as 

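$$\sigma_S^2 = \frac{1}{B-1} \sum_{b=1}^{B} \left[ S^{(b)} - \langle S^{(b)} \rangle \right]^2 , \qquad (8)$$

where $\langle S^{(b)} \rangle = \sum_{b=1}^{B} S^{(b)} / B$. The absence of a large, Westphal-like sub-sample is the only 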
difference between the original samples and the bootstrap galaxy samples. This implies that 
edge and finite-volume effects will be slightly more important in the bootstrap samples, 
particularly for large cells, and we expect that the error bars will be slightly overestimated. 
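
The resampling scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the estimator $\bar{w} = F_2/F_1^2 - 1$ is the standard counts-in-cells form and is assumed here, the function names (`wbar`, `block_bootstrap`) are hypothetical, and the toy counts are randomly generated placeholders rather than LBG data.

```python
import random

def wbar(counts):
    """Average correlation from cell counts, assuming wbar = F2 / F1**2 - 1."""
    n = len(counts)
    f1 = sum(counts) / n                        # first factorial moment <N>
    f2 = sum(c * (c - 1) for c in counts) / n   # second factorial moment <N(N-1)>
    return f2 / f1 ** 2 - 1.0

def block_bootstrap(blocks, n_boot=100, seed=0):
    """blocks: one list of cell counts per sub-sample.
    Returns (estimate, bootstrap variance, bootstrap bias)."""
    rng = random.Random(seed)
    estimate = wbar([c for blk in blocks for c in blk])   # no resampling
    stats = []
    for _ in range(n_boot):
        # Draw whole sub-samples with replacement; their counts are pooled,
        # the sub-samples are never stitched into larger fields.
        drawn = [rng.choice(blocks) for _ in blocks]
        stats.append(wbar([c for blk in drawn for c in blk]))
    mean = sum(stats) / n_boot
    variance = sum((s - mean) ** 2 for s in stats) / (n_boot - 1)
    bias = mean - estimate
    return estimate, variance, bias

# Toy input: 10 sub-samples of 200 placeholder cell counts each (not real data).
_rng = random.Random(1)
toy_blocks = [[_rng.randint(0, 4) for _ in range(200)] for _ in range(10)]
estimate, variance, bias = block_bootstrap(toy_blocks)
```

Because whole sub-samples are drawn, the correlations inside each block are preserved, which is the reason for using a blockwise scheme rather than resampling individual galaxies.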

Our estimates of $(\sigma^2_{\bar w})^{1/2}$ are shown in Figure 1. Note that the signal-to-noise ratio for 
the average correlation function equals 1 at ~ 15 arcsec. We compared our results with the 
predictions of Szapudi & Colombi (1996) evaluated for a single 9x9 arcmin^2 subcatalogue, 
adopting reasonable values for the hierarchical amplitudes of the higher order correlations of 
the LBGs, i.e. assuming that they are of the same order of magnitude as the values for local 
galaxies measured from the APM survey (Gaztañaga 1994; Szapudi et al. 1995). Bootstrap 
and analytical variances show the same trend with the cell size; however, as expected, the 
bootstrap estimates of $\sigma_{F_1}$ and $\sigma_{F_2}$ are approximately a factor of 3 smaller than their 
analytical counterparts, since we combined the data from all the samples, while the 
analytical estimates could be computed for a single field only. Furthermore, Colombi et al. (2000) have 
shown that analytical methods overestimate error amplitudes by a factor of ~ 2 at small 
scales, although this result has been obtained for three-dimensional data, in a regime with 
negligible discreteness errors. 

The mean value of a non-linear combination of two stochastic variables with non-vanishing 
variance generally differs from the same combination of their mean values. Therefore, 
the statistic $\hat{w}$, which is obtained from the ratio of the unbiased statistics $F_1$ and $F_2$, is 
a biased estimator of $\bar{w}$, namely $\langle \hat{w} \rangle \neq \bar{w}$. To quantify the bias let us introduce the parameter 
$b_w = \langle \hat{w} \rangle - \bar{w}$. Analytical estimates for the expectation value of $b_w$ are available to leading 
order in S/S* (Bernstein 1994; Hui & Gaztañaga 1999; Szapudi, Colombi & Bernardeau 
1999), and show that the bias results from a combination of discreteness, edge, and finite 
volume effects. Thus, we expect $b_w$ to decrease if the variances of $F_1$ and $F_2$ become smaller, 
i.e. by considering larger and larger galaxy samples. Note that $b_w$ also includes the so-called 
"integral constraint" bias (e.g. Peebles 1980), which derives from using the observed galaxy 
number density as an estimator of the mean density of the parent population. For relatively 
small samples, this is more likely an overestimate, because of the existence of positive 
correlations between the galaxy positions at small separations. In other words, identifying 
the sample mean density with the population value corresponds to assuming that the average 
correlation function over the surveyed region vanishes, i.e. positive correlations at small 
scales are balanced by a spurious lack of power at large separations. In our case, the galaxy 
sample is actually the combination of several fields obtained in different areas of the sky, and 
since all the sub-samples are covered with cells of the same size and shape, we automatically 
used the average density over the whole sample to normalize the correlation function. 
This reduces the integral constraint bias. If $S_i$ is the surface of the i-th sub-sample and 
$\Theta_i$ its linear size, the integral constraint bias of $\bar{w}$ is proportional to $[\sum_i S_i^2 \bar{w}(\Theta_i)]/(\sum_i S_i)^2$. 
This result neglects the contribution by discreteness effects, as in Bernstein (1994) and Hui 
& Gaztañaga (1999); see Szapudi, Colombi & Bernardeau (1999) for the general case. As 
expected, the combination of statistically independent data leads to an increased accuracy. 
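
The gain from combining fields can be checked numerically with the weighting factor quoted above, $[\sum_i S_i^2 \bar{w}(\Theta_i)]/(\sum_i S_i)^2$. The sketch below is only a sanity check under simplifying assumptions: the helper name `integral_constraint_weight` is hypothetical, all fields are given the same area, and the value 0.03 for the field-scale average correlation is a placeholder, not a measurement.

```python
def integral_constraint_weight(areas, wbar_at_field_scale):
    """Return sum_i S_i^2 wbar(Theta_i) / (sum_i S_i)^2 for a set of fields."""
    total_area = sum(areas)
    weighted = sum(s * s * w for s, w in zip(areas, wbar_at_field_scale))
    return weighted / total_area ** 2

# Placeholder inputs: equal-area fields (arcmin^2) with the same field-scale wbar.
single_field = integral_constraint_weight([81.0], [0.03])
ten_fields = integral_constraint_weight([81.0] * 10, [0.03] * 10)
ratio = single_field / ten_fields   # how much the bias term shrinks when pooling
```

For equal, statistically independent fields the factor scales as the inverse of the number of fields, which is the sense in which normalizing by the density of the whole sample reduces the integral constraint bias.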

We estimated the bias of w using the blockwise bootstrap method, namely 

$$b_w = \langle \hat{w}^{(b)} \rangle - \hat{w} , \qquad (9)$$

where we computed $\hat{w}$ from the whole sample with no resampling. Results are shown in 
Figure 1 together with the measure of $\bar{w}(\Theta)$. As expected (Szapudi, Colombi & Bernardeau 
1999; Hui & Gaztañaga 1999), on average, the biased estimator $\hat{w}$ tends to underestimate the 
actual clustering amplitude at all scales. For small angular separations, the bias is negligible 
with respect to the scatter due to cosmic variance. At larger scales, $b_w$ and $(\sigma^2_{\bar w})^{1/2}$ become 
comparable because of increasingly larger edge-effects. 



3.3. Analysis of the Individual Catalogues 

In Section 3.1, we computed the angular two-point correlation function of LBGs by combining 
all our data. The results have been obtained by throwing cells, at random, over 
all the fields simultaneously (hereafter, method 1). In principle, this can generate systematic 
errors due to field-to-field variations of the actual limiting magnitude and photometric 
zero-points. Many physical effects, such as variable atmospheric conditions, galactic obscuration, 
and intergalactic extinction, can be responsible for such inhomogeneities. Hence, to minimize 
spurious inter-field fluctuations of the density of objects, we also measured $\bar{w}$ for each single 
field, and averaged the results over the catalogues (hereafter, method 2). The drawback of 
this method is that it is affected by a strong integral constraint bias, which, following the 
discussion in §3.2, we estimate to be at least 5 times larger than for the technique used in 
Section 3.1 (this is obtained assuming negligible field-to-field variations). Our results are 
presented in Figure 2. As expected, fluctuations between the functions w of different fields 
are huge, often larger than the correlations themselves. This happens for two reasons: the 
presence of intrinsic fluctuations of the clustering amplitude among the different samples, 
and large fractional fluctuations in the mean density, reflecting either that inter-field variations 
are large, or that the correlation function on the scale of the fields is not negligible. It 
is practically impossible to distinguish between the contributions due to the different effects. 
As discussed in Section 3.2 of G98, however, we do not expect field-to-field variations due 
to "observational accidents" (i.e. due to varying observing conditions) to introduce a large 
spurious contribution to the observed clustering signal. The discrepancy between the results 
derived with the two methods should be mostly ascribed to the integral constraint bias, 
which is proportional to the average correlation function, $\bar{w}$, evaluated on the typical scale 
of a single field. An estimate of the correlation function on the scale of ~ 9 arcmin can be 
obtained from the variance of galaxy counts in our 9x9 arcmin^2 fields. However, since we 
only have 7 "cells", the measurement error is large. We find $\bar{w}$ = 0.026 ± 0.013. Note that 
this is probably an underestimate since we treated the different fields as independent. The 
discrepancy between the correlation function obtained averaging over the fields (represented 
by squares in the inset of Figure 2) and the results obtained with method 1 (shown with 
triangles) is ~ 0.05, independently of the scale. This is an estimate of the difference between 
the integral constraint bias of method 2 and the (probably small) systematic error arising 
from inter-field variations (for method 1). Note that the resulting bias is of the same order 
of magnitude as the signal. Estimators similar to method 2 have been used by G98, who assumed 
that systematic errors due to the integral constraint were negligible. Their estimates 
for the correlation function of LBGs might then be biased low. 



3.4. The Maximum Likelihood Analysis 

In this section we will remeasure the function $\bar{w}(\Theta)$ with a maximum likelihood analysis, 
avoiding any direct shot-noise subtraction. In this case, discreteness effects are taken into 
account by adopting a model for the galaxy CPDF. The motivation for such a study is to 
determine how different treatments of shot-noise affect the measure of the galaxy correlation 
function. In particular, it is fundamental to check if the somewhat surprising behaviour of 
$\bar{w}(\Theta)$ at $\Theta$ < 30 arcsec shown in Figure 1 is caused by discreteness effects (see also §5 for 
a specific discussion). For this reason, the results of the maximum likelihood analysis will 
be cross-checked with those obtained in the previous sections by measuring the factorial 
moments of the CPDF. 

The method of maximum likelihood provides a robust solution to the problem of estima- 
tion. Its key idea is that the best estimate of a parameter is that giving the highest chances 
that the observed set of measurements will be obtained. Knowledge of the probability of ob- 
taining a set of observations from a population is then required. In order to apply maximum 
likelihood analyses to the case of galaxy counts-in-cells, it is therefore necessary to make 
a strong "a priori" assumption for the shape of the population CPDF (and, depending on 
the method adopted, for the nature of the fluctuations of its estimates, obtained from finite 
samples, around the expectation value). Inappropriate models or assumptions will obviously 
result in biased estimates for the average correlation function. 

To the best of our knowledge, no theoretically justified model for the CPDF of the 
projected distribution of galaxies is available. For the maximum likelihood analysis, we 
then adopt two phenomenological models for the CPDF, the first is an extension of the 
negative binomial distribution and the second is the result of Poisson sampling a lognormal 
distribution. In both cases, the agreement with the data (see Figure 5) seems good enough 
to justify their use as first analytical approximations to the observed distribution. 



3.4.1. The Negative Binomial Distribution 

The negative binomial (or modified Bose-Einstein) distribution has been used to de- 
scribe the PDF of discrete counts in a number of different fields ranging from various branches 
of physics, to biostatistics and econometrics. In a series of independent binary events (success 
or failure), having the same chances of success, the negative binomial distribution accounts 
for the probability of the number of trials necessary for the occurrence of a given number 
of successes. Among the most widely used discrete distributions, it represents the classical 
example for overdispersion (with respect to the Poisson distribution), since it is characterized 
by $\langle (N - \bar{N})^2 \rangle > \bar{N}$, where $\langle N \rangle = \bar{N}$. In cosmology, it has been originally adopted to 
describe the distribution of galaxy counts in a Zwicky cluster (Carruthers & Minh 1983) and 
then it has been extensively used in three-dimensional counts-in-cells analyses (e.g. Ueda & 
Yokoyama 1996 and references therein). 
The dispersion of the negative binomial distribution is characterized by an integer parameter. 
We adopt here a modified version in which the measure of the correlation is real valued: 

$$P_N = \frac{\bar{N}^N}{N!} (1 + \bar{w}\bar{N})^{-N - 1/\bar{w}} \prod_{i=1}^{N-1} (1 + i\bar{w}) . \qquad (10)$$

Elizalde & Gaztañaga (1992) showed how this distribution is obtained by modifying the Poisson 
process to account for binary interactions between the points. In practice, $\bar{w}$ corresponds 
to an attraction/repulsion parameter, and $P_N$ is obtained by populating the cells sequentially 
(adding one object at a time), and assuming that the probability that a point belongs to a 
cell is proportional to the number of points which are already occupying this cell. 
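
As a quick consistency check, the modified negative binomial of equation (10), $P_N = (\bar{N}^N/N!)\,(1 + \bar{w}\bar{N})^{-N-1/\bar{w}} \prod_{i=1}^{N-1}(1 + i\bar{w})$, can be evaluated numerically. The sketch below is illustrative only: the function name `neg_binomial_pn` and the parameter values ($\bar{N} = 2$, $\bar{w} = 0.5$) are arbitrary choices, and the code is restricted to $\bar{w} > 0$. It verifies that the distribution is normalized, has mean $\bar{N}$, and returns the input $\bar{w}$ through the factorial-moment combination $F_2/F_1^2 - 1$.

```python
import math

def neg_binomial_pn(n, nbar, w):
    """Modified negative binomial P_N for mean count nbar and real-valued w > 0."""
    log_p = n * math.log(nbar) - math.lgamma(n + 1)       # nbar^N / N!
    log_p -= (n + 1.0 / w) * math.log1p(w * nbar)         # (1 + w nbar)^(-N - 1/w)
    log_p += sum(math.log1p(i * w) for i in range(1, n))  # prod_{i=1}^{N-1} (1 + i w)
    return math.exp(log_p)

# Illustrative parameters (not fitted values).
nbar, w = 2.0, 0.5
probs = [neg_binomial_pn(n, nbar, w) for n in range(200)]
total = sum(probs)
f1 = sum(n * p for n, p in enumerate(probs))             # <N>
f2 = sum(n * (n - 1) * p for n, p in enumerate(probs))   # <N(N-1)>
recovered_w = f2 / f1 ** 2 - 1.0
```

Working with logarithms avoids overflow of the factorial and of the product for large N.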



3.4.2. The Discretized Lognormal Distribution 

The PDF of the mass density contrast is expected to be lognormal to a good approxima- 
tion (e.g. Coles & Jones 1991; Kofman et al. 1994; Bernardeau & Kofman 1995). Because 
of this, it has often been assumed that also the three-dimensional galaxy counts are log- 
normally distributed. Here, we use a Poisson-sampled lognormal distribution to model the 
two-dimensional galaxy counts 



This choice is not inspired by any theoretical motivation, and it is mainly dictated by sim- 
plicity. 

Note that both the negative binomial and the Poisson-sampled lognormal distributions are hierarchical 
(i.e. the correlation functions of order j satisfy $\bar{w}_j = K_j \bar{w}^{j-1}$ with $K_j$ a constant) and 
reduce to the Poisson distribution when $\bar{w} \to 0$. 



3.4.3. Likelihood for the CPDF 

It has been recently proposed to directly use the CPDF for maximum likelihood analyses 
(Kim & Strauss 1998). For this purpose, it is necessary to know how the estimates $\hat{P}_N$, 
derived from finite samples, are distributed around the corresponding population values $P_N$. 
For three-dimensional counts-in-cells, this problem has been addressed by Szapudi et al. 
(2000) using very large N-body simulations. For our two-dimensional problem, we studied 
the nature of these fluctuations ($\Delta P_N = \hat{P}_N - P_N$) using bootstrap resampling. Like Szapudi 
et al. (2000), we find that the distribution of each $\Delta P_N$ is non-Gaussian (and, for a 
given cell size, its skewness depends on N). Only in the tails of the CPDF, where $P_N$ is 
very low, do errors nearly follow a Poisson distribution (cf. Kim & Strauss 1998). Moreover, 
fluctuations at different values of N are correlated (the covariance matrix of the $\Delta P_N$ must 
be singular because of the normalization constraint of the CPDF, $\sum_N P_N = 1$). Writing 
a likelihood function which takes into account all these constraints is extremely challenging. 
Luckily, useful approximations that also provide some insight into the characteristics of the 
unknown distribution functions are available. In order to allow for correlations, we use a 
principal component analysis (see §4.1 for a brief definition) of the $\Delta P_N$ resulting from the 
bootstrap analysis, and, to work with a relatively simple likelihood function, we assume 
Gaussian errors. This seems to be a reasonable approximation, since the distribution of the 
eigenvectors of the covariance matrix (i.e. the principal components) indeed closely resembles 
a Gaussian function. Note that, when only a subset of principal components is considered, the 
assumption of gaussianity does not imply assuming that all the $\Delta P_N$ are normal variables, 
which in general will not be true. Subsequently, we write a Gaussian likelihood function 
(taking into account a finite number of principal components of $\Delta P_N$) and we look for its 
maximum value by varying $\bar{w}$. To deal with a one-dimensional minimization procedure, a 
value for $\bar{N}$ must be specified a priori. We considered two possibilities: the average density 
for the whole sample and $F_1$ corresponding to the cell size under analysis. Similar results 
are obtained in the two cases. The use of $F_1$, however, results in a smoother correlation 
function. The results, shown in Figure 3, are in good agreement with our previous findings, 
confirming the presence of a small-scale break in the estimated angular correlation function. 
The most striking feature is that the new estimates of the correlation are biased low with respect 
to those derived from the factorial moments of the CPDF. These results change only slightly 
when Gaussian and independent errors for $\Delta P_N$ are assumed. 

The Gaussian assumption could be easily relaxed by deciding not to keep track of cross-correlations 
between different values of N (e.g. Kim & Strauss 1998). In this case, we could 
directly adopt the probability distribution deriving from the bootstrap analysis: 
$\mathcal{L}(\bar{w}) \propto \prod_N P_{boot}(\Delta P_N)$. However, an accurate determination of the tails of $P_{boot}$ would require 
extremely large amounts of CPU time, and the results would be in any case questionable. 
We did not follow this approach. 



3.4.4. Likelihood for Cell Counts 

A widely used technique for the estimation of $\bar{w}$ consists in applying maximum likelihood 
methods directly to cell counts (e.g. Efstathiou et al. 1990; A98). In this case, the 
likelihood is simply the joint probability distribution of the observed counts. This quantity 
reduces to the product of one-point probabilities whenever the cells are statistically independent 
(strictly speaking, this never happens when the cells are extracted from the same 
catalogue because of the density modes with wavelength larger than the separation between 
the centers of the cells). Independence is obviously violated in our case of overlapping cells. 
However, if we assume to have a fair sample of galaxies, we can invoke ergodicity to associate 
the counts in different cells with different realizations of the density field. In other words, if 
the size of the survey under analysis is much larger than any correlation length, statistically 
independent cells will dominate the spatial average (Szapudi & Colombi 1996). In practice, 
this means that, if one analyzes a fair sample, results obtained by maximizing the likelihood 
function $\mathcal{L}(\bar{w}) \propto \prod_i P_{N_i}$ should be correct even though the assumption of statistical independence 
of the cells in the observed sample is not. Biased estimates for $\bar{w}$ are obtained if 
the sample under analysis is not representative of the whole universe. Note the similarity 
between this method and the analysis of factorial moments: in both cases massive oversampling 
of a finite portion of a process is used to extract information about expectation values 
over the ensemble. We used this method to estimate $\bar{w}$. Our results are shown in the top 
panels of Figure 4. The almost perfect agreement with the correlation obtained by computing 
the factorial moments is striking and reinforces our confidence in the overall robustness of 
our measure. Note that the error bars, determined by looking for the interval that corresponds 
to a decrement of $\log \mathcal{L}$ by 1/2 from its maximum value, quantify only measurement errors 
and do not include cosmic errors. 
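
The procedure just described can be illustrated with a short sketch that maximizes $\log \mathcal{L}(\bar{w}) = \sum_i \log P_{N_i}$ over a grid, using the modified negative binomial as the model. Several choices here are assumptions made for the example and are not taken from the analysis: the mean count is fixed to the sample mean, the search is a plain grid scan over $\bar{w} > 0$, the names `log_pn` and `ml_wbar` are hypothetical, and the toy counts are constructed so that their frequencies follow the model with $\bar{N} = 2$ and $\bar{w} = 0.5$.

```python
import math
from collections import Counter

def log_pn(n, nbar, w):
    """log of the modified negative binomial P_N, valid for w > 0."""
    return (n * math.log(nbar) - math.lgamma(n + 1)
            - (n + 1.0 / w) * math.log1p(w * nbar)
            + sum(math.log1p(i * w) for i in range(1, n)))

def ml_wbar(counts, grid):
    """Scan an ascending grid of w values. Returns (best w, lower, upper), where
    the bounds are the extreme grid points with log L within 1/2 of its maximum."""
    nbar = sum(counts) / len(counts)          # mean count fixed a priori
    hist = Counter(counts)                    # group equal counts for speed
    loglike = [sum(m * log_pn(n, nbar, w) for n, m in hist.items()) for w in grid]
    top = max(loglike)
    inside = [w for w, ll in zip(grid, loglike) if ll >= top - 0.5]
    return grid[loglike.index(top)], inside[0], inside[-1]

# Toy counts (not data): frequencies proportional to the model with w = 0.5.
toy_counts = [n for n in range(40)
              for _ in range(round(4096 * math.exp(log_pn(n, 2.0, 0.5))))]
grid = [0.05 + 0.01 * k for k in range(146)]
w_hat, w_lo, w_hi = ml_wbar(toy_counts, grid)
```

As in the text, the interval returned by such a scan reflects measurement errors only; it says nothing about cosmic errors.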



3.4.5. Kolmogorov-Smirnov Test 

Since none of the maximum likelihood analyses discussed in the previous section is 
rigorous from the conceptual point of view, in this section we apply the Kolmogorov-Smirnov 
test to compare the observed cumulative distribution function $C_N = \sum_{j=0}^{N} P_j$ to the negative 
binomial and lognormal models (cf. Ueda & Yokoyama 1996). The value for $\bar{w}$ which 
corresponds to the highest significance level is taken as the estimate for the population 
value. Results are shown in the bottom panels of Figure 4. An apparent small-scale break 
in the correlation function is observed in this case as well. 
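
A minimal sketch of this comparison is given below for the negative binomial model only. It relies on the fact that, for a fixed number of cells, the highest Kolmogorov-Smirnov significance corresponds to the smallest distance between the observed and model cumulative distributions, so the code simply minimizes that distance on a grid. The names `nb_pn`, `ks_distance` and `ks_wbar`, the grid, and the toy counts are assumptions of the example; the usual caveat that Kolmogorov-Smirnov significance levels are only approximate for discrete distributions also applies.

```python
import math
from collections import Counter

def nb_pn(n, nbar, w):
    """Modified negative binomial P_N for w > 0."""
    return math.exp(n * math.log(nbar) - math.lgamma(n + 1)
                    - (n + 1.0 / w) * math.log1p(w * nbar)
                    + sum(math.log1p(i * w) for i in range(1, n)))

def ks_distance(counts, nbar, w):
    """Largest gap between the empirical and model cumulative distributions."""
    hist = Counter(counts)
    empirical = model = gap = 0.0
    for n in range(max(counts) + 1):
        empirical += hist.get(n, 0) / len(counts)
        model += nb_pn(n, nbar, w)
        gap = max(gap, abs(empirical - model))
    return gap

def ks_wbar(counts, grid):
    """Grid value of w giving the smallest Kolmogorov-Smirnov distance."""
    nbar = sum(counts) / len(counts)
    return min(grid, key=lambda w: ks_distance(counts, nbar, w))

# Toy counts (not data): frequencies proportional to the model with w = 0.5.
toy_counts = [n for n in range(40) for _ in range(round(4096 * nb_pn(n, 2.0, 0.5)))]
grid = [0.05 + 0.01 * k for k in range(146)]
w_ks = ks_wbar(toy_counts, grid)
```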

Note the discrepancy at large scales between the different estimates of $\bar{w}$, whose nature 
can be understood by looking at Figure 5: neither of the two models for $P_N$ adopted in the 
maximum likelihood analysis is able to accurately describe the measured galaxy counts for 
large cell sizes. This is because the observed $P_N$ sharply drops when N reaches a maximum 
value, while the models predict a smooth decrement of the count probability. An excess of 
counts with respect to the model is also detected for values of N slightly smaller than the 
cutoff. We tested that the observed drop is not caused by the finite number of cells used in 
our analysis. After increasing by a factor of 10 the total number of cells used in the CIC 
analysis, we verified that the shape of the CPDF remained unchanged. 



4. FITTING THE DATA WITH A POWER-LAW MODEL 

It is customary to fit the observed correlation function (or its average, as in our case) 
with a power-law model, $\bar{w}(\Theta) = A\Theta^{-\beta}$. In this section, we describe how we have combined 
a least squares method with a principal component analysis of the bootstrap errors to derive 
the best fitting power-law to the results obtained with the CIC analysis (Figure 1). Since 
both the techniques that we have used to estimate $\bar{w}(\Theta)$ consistently return a correlation 
function that departs from a power-law on small scales, we have separately considered a fit 
over large angular scales only as well as a "global" fit. We have then compared the results 
obtained in these two cases to quantify the statistical significance of the apparent "break". 



4.1. Principal Component Analysis of the Errors 

Cosmic errors for the average correlation function evaluated at different angular separations 
are strongly correlated. After defining $\Delta_i = [\hat{w}(\Theta_i) - \langle \hat{w}(\Theta_i) \rangle]/\sigma_{\bar w}(\Theta_i)$ (with $\sigma_{\bar w}$ the 
standard deviation of the estimator $\hat{w}$), we estimate the covariance matrix $C_{ij} = \langle \Delta_i \Delta_j \rangle$ 
using the bootstrap method described in Section 3.2, $\hat{C}_{ij} = [\sum_{b=1}^{B} \Delta_i^{(b)} \Delta_j^{(b)}]/(B - 1)$, with 
$\Delta_i^{(b)} = [\hat{w}^{(b)}(\Theta_i) - \langle \hat{w}^{(b)}(\Theta_i) \rangle]/[\sigma^2_{\bar w}(\Theta_i)]^{1/2}$. It turns out that $\hat{C}_{ij}$ is a quasi-singular matrix, 
since all its components differ from one by less than a few percent. 

The principal components of an N-dimensional random vector are linear combinations of 
its N components that are statistically orthonormal and form a complete basis. Since any 
covariance matrix is symmetric, the principal components of the correlation errors coincide 
with the eigenvectors of $C_{ij}$. In fact, the eigenvectors of the covariance matrix are linear 
combinations of the errors, $E^{(n)} = a_{nj} \Delta_j$, whose variances are given by the corresponding 
eigenvalues, $\lambda_n$. Different eigenvectors are statistically orthogonal (i.e. $\langle E^{(n)} E^{(m)} \rangle = \lambda_n \delta_{nm}$), 
and uncorrelated (since $\langle \Delta_i \rangle = 0$)*. By ranking the eigenvectors in the order of descending 
eigenvalues (starting from the largest), one can create an ordered orthogonal basis with the 
first eigenvector having the direction of the largest variance of the data. 



*Note that this does not imply that the different eigenvectors are statistically independent, unless the $\Delta_i$ 
form a multivariate Gaussian process. 

We reduced the covariance matrix $C_{ij}$ to its diagonal form using the LAPACK routines 
for singular value decomposition (Anderson et al. 1999). The first principal component 
(which is the minimum distance fit to a line in the $\Delta$-space) comes out to be 
$E^{(1)} \simeq (1, \dots, 1)/\sqrt{N_{fit}}$ and $\lambda_1 \simeq N_{fit}$, with $N_{fit}$ the dimension of the $\Delta$-space (as discussed 
in the next section we consider data sets with different dimensionality). Note that 
$E^{(1)}$ accounts for a large fraction (typically, ~ 96%) of the variance of the bootstrap 
data. This means that the $N_{fit}$ residuals lie along a line with a relatively small scatter. In 
other words, for a given realization, cosmic errors tend to have the same sign and the same 
amplitude when expressed in units of $(\sigma^2_{\bar w})^{1/2}$. As a consequence of this, the normalization 
of the average correlation function and its shape are poorly constrained by the data. This 
is another manifestation of the large uncertainty in the mean surface density of LBGs due both 
to statistical fluctuations and to the fact that galaxies are spatially correlated on the sample 
scale. Figure 6 shows how $\bar{w}$ is affected by the most typical error configurations: a monotonically 
decreasing correlation on all scales, as well as a flat curve for $\Theta$ > 30 arcsec, are 
compatible with the data. 
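
The behaviour of the first principal component can be reproduced with a toy example. The sketch below is not the LAPACK-based procedure used for the analysis: it uses NumPy's symmetric eigen-solver `numpy.linalg.eigh` in place of an explicit singular value decomposition, the function name `principal_components` is hypothetical, and the synthetic "bootstrap" values are generated so that a common fluctuation dominates at all scales.

```python
import numpy as np

def principal_components(boot):
    """boot: array of shape (B, N), bootstrap estimates of wbar at N scales.
    Returns eigenvalues (descending), eigenvectors (columns) and the
    correlation matrix of the normalized residuals."""
    delta = (boot - boot.mean(axis=0)) / boot.std(axis=0, ddof=1)
    corr = delta.T @ delta / (boot.shape[0] - 1)
    lam, vec = np.linalg.eigh(corr)          # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]
    return lam[order], vec[:, order], corr

# Synthetic input (not data): a shared fluctuation plus small independent noise.
rng = np.random.default_rng(0)
common = rng.normal(size=(100, 1))
boot = 0.1 + 0.05 * common + 0.005 * rng.normal(size=(100, 8))
lam, vec, corr = principal_components(boot)
fraction_first = lam[0] / lam.sum()          # variance carried by the first component
```

With all off-diagonal elements close to one, the leading eigenvalue approaches the number of scales and the leading eigenvector approaches a uniform vector, as stated in the text.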



4.2. Least Squares Fitting 

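In order to fit a power-law function to the data for the average correlation function, we 
use a least squares method. In practice, we minimize 

$$\chi^2 = \sum_i \frac{e_i^2}{\lambda_i} ,$$

where $e_i$ represents the projection of the vector of the residuals, 

$$\Delta^{(fit)} = \left( \frac{A\Theta_1^{-\beta} - \hat{w}(\Theta_1)}{[\sigma^2_{\bar w}(\Theta_1)]^{1/2}} , \dots , \frac{A\Theta_N^{-\beta} - \hat{w}(\Theta_N)}{[\sigma^2_{\bar w}(\Theta_N)]^{1/2}} \right) , \qquad (12)$$

along the i-th principal component, namely $\Delta^{(fit)} = \sum_i e_i E^{(i)}$. Note that, for $N_{fit}$ = 47, 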
the eigenvalues of Cij span over 8 orders of magnitude, and can be as small as ~ 10~^ 
(we verified this by solving for the eigenvalues of the correlation matrix using the LAPACK 
routines for singular value decomposition). Thus, the value of $\chi^2$ will be typically 
dominated by the contribution coming from the principal components with the smallest 
variances. However, fitting functions which deviate from the data only along the highest-order 
principal components should not necessarily be rejected. In fact, $\hat{C}_{ij}$ is only an estimate of 
the "true" covariance matrix and contains an intrinsic uncertainty. These errors propagate 
in the computation of eigenvalues and eigenvectors, combining also with numerical round-off 
errors, especially for quasi-singular matrices. It is therefore reasonable to consider only the 
most stable linear combinations of the data (i.e. the eigenvectors of $C_{ij}$ corresponding to 
the largest eigenvalues) in the fitting procedure. The problem is to determine how many 
principal components must be discarded. This issue is obviously related to the size of the 
uncertainties in the correlation matrix. We can identify two sources of errors: the finite 
number of bootstrap resamplings we used to compute the ensemble average, and the finite 
number of cells we used to compute the factorial moments. We know that the latter is going 
to produce ~ 1% relative errors in $\hat{w}$, which cause $C_{ij}$ to be systematically underestimated 
by ~ lO^^Kwi/a^,-)'^ + (wj/o"^.)^]. The off-diagonal elements of the matrix Cij — Cij should 
then lie in the range 10~"^ — 10~^ (all the diagonal elements are, by definition, equal to 1). 

What about the other source of error? An unbiased estimate of the random uncertainty 
due to the finite number of resamplings for $k_2 = \langle [\hat{w}^{(b)}(\Theta_i) - \langle \hat{w}^{(b)}(\Theta_i) \rangle]^2 \rangle$ is given by 
$(2Bk_2^2 + (B - 1)k_4)/[B(B + 1)]$, where $k_2$ is computed as in equation (8), 
$k_4 = B^2[(B + 1)m_4 - 3(B - 1)m_2^2]/[(B - 1)(B - 2)(B - 3)]$ is the estimated 4th order cumulant of $\hat{w}^{(b)}(\Theta_i) - \langle \hat{w}^{(b)}(\Theta_i) \rangle$, 
with $m_i$ the sample i-th central moment (Kenney & Keeping 1951, 1962). The relative error 
on k2 decreases with 0, ranging from 5 x 10~^ to 2 x 10~^ with typical values of ~ 10~^. 
Assuming that errors on the i — j covariances are of the same order of magnitude, one gets 
that uncertainties in Cij are a few times Ak2/k2, i.e. of order 10~^ — 10~^. As a final check, we 
split our bootstrap realizations into two halves and compared the corresponding correlation 
matrices: the maximum discrepancy was found to be of order 10~^, while the typical one 
was ~ lO^"'. It is important to remember that, if the errors in the components of a matrix 
are uncorrelated and of order $\epsilon$, then the errors in the eigenvalues are also of order $\epsilon$. Most 
importantly, errors significantly change the direction of the eigenvectors corresponding to 
$\lambda_i < \epsilon$. This would suggest including in the $\chi^2$ calculation only the first few eigenvectors. 
However, the situation changes when the errors in the different components of a matrix are 
strongly correlated. In this case, the direction of the eigenvectors is barely affected by the 
errors. Therefore, there is no strict rule to select the number of principal components, but 
it could be risky to consider eigenvectors corresponding to noisy eigenvalues. Even though 
some simple (but totally empirical) criteria for the selection of the components are commonly 
used in factor analysis, we prefer here to explore a number of different cases, using the 
fraction of variance which has been accounted for as a guiding parameter. 
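
The uncertainty of a bootstrap variance discussed above can be evaluated with the k-statistics of Kenney & Keeping. The following sketch is an illustration with a hypothetical function name (`k_statistics`) and a toy input; it implements the standard expressions $k_2 = B m_2/(B-1)$, $k_4 = B^2[(B+1)m_4 - 3(B-1)m_2^2]/[(B-1)(B-2)(B-3)]$ and the unbiased estimate of the variance of $k_2$, $(2Bk_2^2 + (B-1)k_4)/[B(B+1)]$.

```python
def k_statistics(values):
    """Return (k2, k4, estimated variance of k2) for a list of at least 4 values."""
    b = len(values)
    mean = sum(values) / b
    m2 = sum((v - mean) ** 2 for v in values) / b      # 2nd sample central moment
    m4 = sum((v - mean) ** 4 for v in values) / b      # 4th sample central moment
    k2 = b * m2 / (b - 1)                              # unbiased variance
    k4 = (b ** 2 * ((b + 1) * m4 - 3 * (b - 1) * m2 ** 2)
          / ((b - 1) * (b - 2) * (b - 3)))             # 4th cumulant estimate
    var_k2 = (2 * b * k2 ** 2 + (b - 1) * k4) / (b * (b + 1))
    return k2, k4, var_k2

# Toy input standing in for the bootstrap values of wbar at one scale.
k2, k4, var_k2 = k_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
```

The relative error on the variance then follows as the square root of `var_k2` divided by `k2`.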

Both bootstrap resampling and the analysis of mock galaxy catalogues (see §5.1) show 
that, analyzing a finite galaxy sample, it is more probable to underestimate $\bar{w}$ than to overestimate 
it. This becomes more and more evident with increasing $\Theta$. In the absence of a robust 
maximum likelihood method (the probability density function of $\Delta_i$, and the correlation 
matrix $C_{ij}$ are not known a priori), a simple way to account for the positive skewness of the 
distribution of the residuals is to use our least-squares method after correcting 
the data for the bias of the correlation estimator. We will use this technique in the following 
section. 
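
The fitting scheme of this section, a least squares fit restricted to the leading principal components, can be sketched as follows. Everything in the example is an assumption made for illustration: the function names (`chi2_pc`, `fit_power_law`), the brute-force grid search, the toy angular scales, the 30% errors and the correlation matrix with off-diagonal elements equal to 0.9. The toy data are an exact power law, so the fit should simply return the input parameters.

```python
import numpy as np

def chi2_pc(A, beta, theta, w_obs, sigma, lam, vec, n_pc):
    """chi^2 restricted to the n_pc principal components of largest variance.
    lam, vec: eigenvalues (descending) and eigenvectors (columns)."""
    resid = (A * theta ** (-beta) - w_obs) / sigma   # normalized residual vector
    e = vec[:, :n_pc].T @ resid                      # projections e_i
    return float(np.sum(e ** 2 / lam[:n_pc]))

def fit_power_law(theta, w_obs, sigma, corr, n_pc, A_grid, beta_grid):
    """Brute-force minimization; returns (chi2_min, A_best, beta_best)."""
    lam, vec = np.linalg.eigh(corr)
    order = np.argsort(lam)[::-1]                    # descending eigenvalues
    lam, vec = lam[order], vec[:, order]
    return min((chi2_pc(A, b, theta, w_obs, sigma, lam, vec, n_pc), A, b)
               for A in A_grid for b in beta_grid)

# Toy data (not measurements): exact power law with A = 0.6, beta = 0.5.
theta = np.array([40.0, 50.0, 60.0, 70.0, 80.0, 100.0])
w_obs = 0.6 * theta ** -0.5
sigma = 0.3 * w_obs
corr = 0.9 * np.ones((6, 6)) + 0.1 * np.eye(6)
A_grid = np.linspace(0.1, 1.5, 29)
beta_grid = np.linspace(0.0, 1.0, 21)
chi2_min, A_best, beta_best = fit_power_law(theta, w_obs, sigma, corr, 4,
                                            A_grid, beta_grid)
```

When all components are retained, this $\chi^2$ reduces to the usual quadratic form built with the inverse correlation matrix; truncating the sum discards the directions that are most sensitive to noise in the estimated matrix. Bias-corrected data can be fitted in the same way by passing the corrected values as `w_obs`.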



4.3. Results 

Because of the shape of $\bar{w}(\Theta)$, which does not resemble a single power law over the 
whole range of angular separations that we have studied, we have determined the best-fitting 
power-law functions to the results obtained with the CIC analysis (Figure 1) in two 
different intervals, 8 < $\Theta$ < 100 arcsec, and 40 < $\Theta$ < 100 arcsec. We performed a number 
of $\chi^2$ minimizations, increasing the number of principal components of the residuals, and, 
for comparison, we also computed the best-fitting function assuming independent errors. 
The results, both with and without correction for the bias of our estimator, are listed in 
Table 2 and Table 3. The range of values of the parameters A and $\beta$ that corresponds to 
$\Delta\chi^2$ < 1 (i.e., for Gaussian residuals, the 68% confidence levels) is also given in the tables. 
The low values of $\chi^2_{min}$ per degree of freedom ($\chi^2_{min}$/d.o.f.) obtained considering only the first 
few principal components suggest that they cannot efficiently discriminate among different 
power-law models. On the other hand, including too many principal components, one gets 
very high values of $\chi^2_{min}$/d.o.f. and, as expected, the quality of the fit worsens dramatically (not 
shown in the tables). In general, when a power-law gives a fairly good description of the 
data, the preferred values of A and $\beta$ remain stable with increasing number of principal 
components until a threshold is reached. Adding more components significantly changes the 
best-fitting values $A_{best}$ and $\beta_{best}$. 

An important result from the counts-in-cells analysis is that for $\Theta$ > 40 arcsec the 
correlation function of LBGs is very well described by a power-law model. While our data 
do not constrain separately the values of the parameters A and $\beta$ with great accuracy, we 
find that the value A^ is reasonably well determined. As Figures 7 and 8 show, A and $\beta$ 
are strongly covariant, and regions of the plane with constant $\chi^2$ value are rather elongated 
and nearly unidimensional. When the data are not corrected for the bias of $\hat{w}$, taking the 
analysis with 4 principal components as a reference case, we find that (for $\beta$ > 0) these 
constant $\chi^2$ regions are centered around the line of equation 

$$A \simeq 0.62 \, (0.49 + \beta)^{3.8 + 2.7\beta} , \qquad (13)$$

with a relatively small scatter. For bias corrected data, this becomes 

A~0.56(0.56 + /?)^-^+2-'^^. (14) 

The ranges of variability for the single parameters A and $\beta$ can be determined by projecting 
the curve corresponding to $\Delta\chi^2$ = 1 onto their axes. For the raw data, we find $\beta = 0.50^{+0.25}_{-0.50}$, 
while allowing for the bias of the correlation estimator corresponds to a slightly shallower 
slope, namely /3 = 0.401;q'4q. The amplitude of the best-fitting power-law model is even 
less precisely determined than the slope: when we do not correct for the bias we obtain 
A = O.Gto'e arcsec^, while when we account for b^ we get A = OA'tl'l arcsec^. This difference, 
however, is significantly smaller than the error bar relative to A. It is also interesting that the 
corresponding best-fitting curves lie on opposite sides with respect to the data (see Figures 7 
and 8). As we shall see in Section 8, the preferred values for A and $\beta$ imply, through Limber 
deprojection, a strong spatial clustering, with a correlation length $r_0$ of a few Megaparsecs, 
in agreement with previous estimates by G98. Note that, before computing the best-fitting 
power-law models, we did not correct the data for the dilution of clustering caused by the 
residual contamination by stars in the galaxy catalogues, since this is negligible with respect 
to the overall uncertainty in the amplitude of the correlation function. 

An intriguing result of our analysis is that the best power-law fit at $\Theta \gtrsim 40$ arcsec
does not seem to describe well the average correlation function at smaller scales. Using the
analysis with 4 principal components as a reference case, the best model for $\Theta > 40$ arcsec
corresponds to $\Delta\chi^2 = \chi^2 - \chi^2_{\rm min} = 2.73$ at $\Theta > 20$ arcsec, and to
$\Delta\chi^2 = 5.95$ at $\Theta > 8$ arcsec ^. Taking these results at face value, it seems unlikely
that our data are a realization of a point process with $\bar{w} = A\Theta^{-\beta}$ and
$\beta \simeq 0.5$ on scales $\Theta < 40$ arcsec. As shown in
Figures 7 and 8, there is a relatively narrow region of the parameter space in which the
value of $\Delta\chi^2$ for a power-law model is acceptable both for $\Theta > 40$ arcsec and
$\Theta > 8$ arcsec. Thus, we cannot rule out the possibility that the apparent break in the
correlation function is just due to a statistical fluctuation. However, in this case,
$\chi^2_{\rm min}/{\rm d.o.f.} > 1$ for $\Theta > 8$ arcsec, making a single power-law over the whole
range of angular separations an unlikely model, while the power-law fit is excellent for
$\Theta > 40$ arcsec. We will describe in a moment
a series of additional tests that we have performed to quantify the statistical significance of
the observed behaviour of $\bar{w}(\Theta)$ at small scales.



5. THE SMALL-SCALE "BREAK" 

There are three possible areas where we can look for the cause of a spurious "break" in
the correlation function at small separations, namely biased data acquisition or reduction,
systematic errors in the estimate of $\bar{w}$, and statistical fluctuations (e.g. due to shot noise or
cosmic variance). This section is devoted to a discussion of these issues.



^This is obtained neglecting the correction for the bias of the estimator. Including the correction yields
$\Delta\chi^2 = 1.20$ at $\Theta > 20$ arcsec, and $\Delta\chi^2 = 3.61$ at $\Theta > 8$ arcsec.
We can think of no obvious way in which systematics during the observations and data
reduction could introduce an artificial small-scale break in an otherwise correlated
distribution of galaxies. This would require missing from an angular sample pairs of galaxies
separated by ~ 30 arcsec or less, a scale which is about a factor of 20 or more smaller
than the size of the fields we have observed, and a factor of ~ 20 larger than the typical
isophotal size of an LBG. As discussed by G98, it is possible that the data are affected by
slight variations of sensitivity from field to field and across the fields, with the central part
being slightly deeper than the edges. Field-to-field variations would introduce a bias similar
to the integral constraint, but they would not change the shape of the correlation function
at small scales and, in particular, would not mimic the break. Intra-field variations would
artificially increase the whole clustering signal over scales of the order of a few arcmin, but
they would not affect it only at small scales.

It also seems unlikely that the break is an artifact of the data analysis. The phase of the
galaxy detection algorithm in which systematics are most likely to be introduced is
the splitting of sources with blended isophotes. Improper splitting can potentially lead
to an underestimate of the correlation function at very small angular separations if close pairs of
galaxies are recorded as single objects by the detection software. We can estimate how
many pairs could potentially be affected by undersplitting in our sample. For small angular
separations, and assuming a power-law model $w(\theta) = A_w\theta^{-\beta}$ (with $\beta < 2$), the
number of expected pairs separated by less than $\theta$ is



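$$N_{\rm pairs}(<\theta) = \frac{1}{2}\,N_g\,\mathcal{N}\,2\pi \int_0^{\theta} \theta'\,[1 + w(\theta')]\,d\theta'
= \frac{1}{2}\,N_g\,\mathcal{N}\,\pi\theta^2 \left[1 + \frac{2A_w}{(2-\beta)\,\theta^{\beta}}\right], \qquad (15)$$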

where $\mathcal{N}$ is the average surface density of LBGs, and $N_g$ the total number of galaxies in
our sample. Since the typical isophotal size of LBGs in the images is in the range 1-2 arcsec,
we expect that only pairs with $\theta < 3$-4 arcsec can be mistakenly considered
as single objects. For the best-fitting power-law to our data at $\Theta > 40$ arcsec, we find
$N_{\rm pairs}(< 4\ {\rm arcsec}) \simeq 11.4$, comparable to the 8 such close pairs observed in the real
sample, which suggests that undersplitting is unlikely to be a factor in our sample. To quantify
the effect of undersplitting, we have used a set of mock galaxy samples extracted from a correlated
distribution with no break in the correlation function and artificially merged close pairs.
These samples have the same average surface density and two-point correlation function
(with no break at small scales) as the real LBGs (see Section 5.1 for details). Surprisingly,
after merging all the pairs with angular separations $\theta < 4$ arcsec, we found that the estimated
values for $\bar{w}(\Theta)$ at $8 \lesssim \Theta \lesssim 40$ arcsec are actually larger than before.
This is a consequence of the complex interplay between clustering and shot noise. What happens is
that for $\Theta > 8$ arcsec, the second moment of the counts $\langle N^2 \rangle$ is nearly
unaffected by the merging of the close pairs, while $\langle N \rangle$ decreases by a factor
$1 + f_{\rm crowd}$, with $f_{\rm crowd} \ll 1$. Hence, the term
$\langle N^2 \rangle/\langle N \rangle^2$ increases by a factor $\simeq 1 + 2f_{\rm crowd}$, while the
shot noise term $1/\langle N \rangle$ increases only by a factor $\simeq 1 + f_{\rm crowd}$. The net
result is that the observed correlation function at $\Theta \gtrsim 8$ arcsec increases. In other
words, since the average of the correlation function over
the sample must vanish, imposing $w = -1$ at small separations corresponds to increasing its
value at larger $\theta$. This shows that crowded isophotes cannot be the cause of the detected
break in $\bar{w}$.
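
The closed form in equation (15) can be cross-checked against a direct numerical evaluation of the
integral. The sketch below does this for a power-law $w(\theta)$; the function names and the input
values in the usage example (number of galaxies, surface density, amplitude) are hypothetical
placeholders, not the values of our sample.

```python
import math


def expected_close_pairs(n_gal, surface_density, amplitude, beta, theta):
    """Closed form of equation (15) for w(theta) = amplitude * theta**(-beta)."""
    if not beta < 2:
        raise ValueError("the integral converges only for beta < 2")
    boost = 1.0 + 2.0 * amplitude / ((2.0 - beta) * theta**beta)
    return 0.5 * n_gal * surface_density * math.pi * theta**2 * boost


def expected_close_pairs_numeric(n_gal, surface_density, amplitude, beta, theta,
                                 steps=200_000):
    """Midpoint-rule evaluation of the integral in equation (15)."""
    h = theta / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h                      # midpoint of the i-th sub-interval
        total += t * (1.0 + amplitude * t ** (-beta)) * h
    return 0.5 * n_gal * surface_density * 2.0 * math.pi * total


if __name__ == "__main__":
    # Hypothetical inputs: 1000 galaxies, 1.2 galaxies per arcmin^2, theta in arcsec.
    args = dict(n_gal=1000, surface_density=1.2 / 3600.0, amplitude=1.0, beta=0.5, theta=4.0)
    print(expected_close_pairs(**args), expected_close_pairs_numeric(**args))
```

Setting `amplitude=0` recovers the Poisson expectation, so the bracket in equation (15) is
the fractional excess of close pairs produced by clustering.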

Systematic errors could also have been introduced during the measurement of the correlation
function. An important test to carry out when assessing the significance of the break is to
check whether it is at all possible, with our data set, to detect in a statistically significant way
deviations of the CPDF measured at small scales from a Poisson distribution. That this can
be done is not obvious, since at $\Theta = 8$ arcsec, $\langle N \rangle \simeq 0.07$, and the
shot-noise contribution dominates the variance of the CPDF. Even if the large number of cells used
in our analysis guarantees an optimal shot-noise subtraction (and the results of the maximum
likelihood analysis presented in Section 3.4 agree with this expectation), it is possible that the
break is simply the result of the inability to measure $\bar{w}(\Theta)$ at scales where the
signal-to-noise is low*. We carried out both the Kolmogorov-Smirnov and the Cramér-Smirnov-von Mises
(e.g. Eadie et al. 1971) tests and found that only at $\Theta < 10$ arcsec are our data compatible
with being a random sampling from a Poisson distribution at the 95% confidence level. In
the interval $10 \lesssim \Theta \lesssim 30$ arcsec the data are significantly more clustered than
the Poisson case, although less clustered than the extrapolation to small scales of the power-law
fit at $\Theta \gtrsim 30$ arcsec. Finally, we have also tested the stability of the break against
artificially diluting our samples by factors of 4/3 and 2. Since shot noise is inversely
proportional to the average density in the catalogue [see equation (A-4)], whenever discreteness
effects are not subtracted correctly, changing the sampling rate introduces a bias in the estimate
of $\bar{w}$. We noticed no significant change in the shape of $\bar{w}$ when averaging over 20
different sparsely sampled realizations. As a further test, we have also split our catalogue into
two sub-samples and repeated the measurement of $\bar{w}(\Theta)$ in each of them. We took one
sub-sample to be the Westphal catalogue, and the other the remainder of the fields. We measured
$\bar{w}(\Theta)$ with the same techniques described above. As shown in the inset of Figure 2, we
found that the correlation function of each sub-sample is characterized by a small-scale break
similar to that of the whole sample, suggesting that it is not the result of statistical
fluctuations, but is a real feature of our sample.
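
The logic of the dilution test can be illustrated with a minimal sketch, assuming the
shot-noise-corrected estimator is the second factorial moment of the counts,
$\langle N(N-1) \rangle/\langle N \rangle^2 - 1$ (equivalent to subtracting the $1/\langle N \rangle$
term discussed above). The toy clustered counts and all function names are hypothetical and are
not the pipeline used for the real catalogue.

```python
import random


def wbar_from_counts(counts):
    """Estimate the cell-averaged correlation from counts in cells.

    Uses <N(N-1)> / <N>^2 - 1, i.e. <N^2>/<N>^2 - 1 - 1/<N>.
    """
    n_cells = len(counts)
    mean = sum(counts) / n_cells
    if mean == 0:
        raise ValueError("all cells are empty")
    second_factorial = sum(n * (n - 1) for n in counts) / n_cells
    return second_factorial / mean**2 - 1.0


def dilute(counts, keep_fraction, rng):
    """Keep each galaxy independently with probability keep_fraction."""
    return [sum(1 for _ in range(n) if rng.random() < keep_fraction)
            for n in counts]


if __name__ == "__main__":
    rng = random.Random(1)
    counts = []
    for _ in range(20000):
        cell_mean = rng.gammavariate(2.0, 1.0)   # cell-to-cell density fluctuations
        # Toy discrete sampling: 50 Bernoulli trials with success rate cell_mean / 50.
        counts.append(sum(1 for _ in range(50) if rng.random() < cell_mean / 50))
    for keep in (1.0, 0.75, 0.5):
        print(keep, round(wbar_from_counts(dilute(counts, keep, rng)), 3))
```

Because random thinning rescales the k-th factorial moment by the k-th power of the sampling rate,
a correctly shot-noise-subtracted estimate should stay the same, within the noise, at every
dilution.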



*Note that Szapudi, Meiksin & Nichol (1996) successfully used the method of factorial moments (with 
infinite sampling) even for slightly smaller average densities. 
5.1. Mock Catalogues

If the break is not the result of an improper measure (i.e. if it is present in our sample), 
that still does not mean that it reflects the clustering properties of the parent distribution, 
of which our galaxy fields are realizations. Since our sample of LBGs is relatively small, 
it could be, for example, that it is not representative of the parent distribution, and that 
the small-scale break is simply the result of normal fluctuations for the particular intrinsic 
clustering properties. In this section, we use numerical simulations to estimate the likelihood 
for this to occur. 

Specifically, we measure $\bar{w}(\Theta)$ (with the method of the factorial moments) from a large
number of realizations of a point process which has the same large-scale clustering properties
as the LBGs but no small-scale break, to estimate the probability of detecting a significant deficit
of close pairs in an artificial sample similar to the real one. We generate the mock LBG
samples from a lognormal random field that has intrinsic $\bar{w}(\Theta)$ equal to the best power-law
fit to the data for $\Theta > 40$ arcsec, namely $\bar{w}(\Theta) = 0.65/\Theta^{0.50}$, which is
obtained assuming independent errors, and lies on top of the data points (note that using the other
fitting functions discussed in Section 4 produces similar results). We generate the fields over a
grid with physical size of $3600 \times 3600$ arcsec$^2$ and grid step of 1.76 arcsec, smaller than the
minimum distance between observed LBGs (1.92 arcsec). We subsequently extract the point
sample by performing a Poisson sampling of the density field (see Appendix A). We take
the probability of finding "galaxies" at any given location to be proportional to the intensity of
the density field at that point, and we normalize it to obtain a distribution which has the
same average density as the observed sample of LBGs. In this way, both the large-scale two-point
correlations and the surface density of objects are the same as those estimated from
the galaxy sample (note that a grid point can host more than one galaxy). We generate eight
different realizations of the field to extract eight samples of objects identical in shape and
dimensions to the LBG fields. We then estimate $\bar{w}$ from these mock catalogues by computing
the factorial moments of the CPDF. The process is then repeated 1100 times to simulate the
"cosmic variance".

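The two steps described above (a lognormal density field followed by Poisson sampling) can be
sketched on a small grid as follows. This is only an illustration under simplifying assumptions:
the correlated Gaussian field is obtained by box-smoothing white noise with periodic boundaries,
which does not reproduce the power-law correlation used for the actual mocks, and the grid size,
`sigma` and `mean_per_pixel` are arbitrary placeholders.

```python
import math
import random


def poisson_draw(mean, rng):
    """Poisson deviate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def lognormal_intensity(n_side, half_width, sigma, rng):
    """Unit-mean lognormal field built from box-smoothed Gaussian white noise."""
    white = [[rng.gauss(0.0, 1.0) for _ in range(n_side)] for _ in range(n_side)]
    width = 2 * half_width + 1
    field = []
    for i in range(n_side):
        row = []
        for j in range(n_side):
            acc = 0.0
            for di in range(-half_width, half_width + 1):
                for dj in range(-half_width, half_width + 1):
                    acc += white[(i + di) % n_side][(j + dj) % n_side]
            # A sum of width**2 unit normals has variance width**2: rescale to unit variance.
            row.append(math.exp(sigma * acc / width))
        field.append(row)
    mean = sum(map(sum, field)) / n_side**2
    return [[value / mean for value in row] for row in field]


def poisson_sample(intensity, mean_per_pixel, rng):
    """Draw a count in every pixel with expectation mean_per_pixel * intensity."""
    return [[poisson_draw(mean_per_pixel * value, rng) for value in row]
            for row in intensity]


if __name__ == "__main__":
    rng = random.Random(42)
    field = lognormal_intensity(n_side=32, half_width=2, sigma=0.8, rng=rng)
    mock = poisson_sample(field, mean_per_pixel=0.5, rng=rng)
    print("galaxies in mock:", sum(map(sum, mock)))
```

As in the text, a pixel can host more than one object, and the normalization of the intensity
fixes the mean surface density of the mock sample.
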
The results of the simulations are summarized in Figure 9, where the first two moments
of the distribution of $\bar{w}$ over the 1100 realizations are compared with the correlation function
of the population. Note that the average correlation, $\langle \bar{w} \rangle_{\rm mock}$, decreases
monotonically with $\Theta$ on all scales. The cosmic bias is always negative, and, on large scales,
comparable with the cosmic error. What do the simulations say about the statistical significance
of the observed break? The probability distribution of $\bar{w}(\Theta = 10\ {\rm arcsec})$ over the
1100 realizations of the mock galaxy samples (Figure 10) shows that in only 13% of the cases is the
measured correlation smaller than the observed value, $\bar{w}_{\rm obs}$. This, however, does not
quantify the significance of the break. Since data points at different angular separations are
strongly correlated, in those realizations where galaxy correlations at $\Theta = 10$ arcsec are
feeble, $\bar{w}$ tends to assume values that are smaller than the population value at every
$\Theta$. For example, for $\Theta > 30$ arcsec, we find that only ~ 10% of these realizations also
have $\bar{w} > \bar{w}_{\rm obs}$. In other words, $\bar{w}(\Theta)$ of these samples does not show a
break, it simply is small. What we need to test is how often a deviation from a smooth,
power-law-like behaviour is encountered, namely how often a correlation function which is
monotonically decreasing with $\Theta$ on large scales changes its behaviour on small ones.

To test the shape of the correlation, we have used a non-parametric statistic as follows.
First, for a given realization, we rank the values of $\bar{w}(\Theta_i)$ in order of increasing cell
size $\Theta_i$. Subsequently, we generate a new set of ranks by sorting the values of
$\bar{w}(\Theta_i)$ from the largest to the smallest. For a monotonically decreasing function, like a
power-law with $\beta > 0$, the two ranks coincide for every $\Theta_i$. In general, however, the
correlation function evaluated at a given $\Theta_i$ will get different ranks in the two ordering
schemes. This difference can be quantified using tests suited to compare sets of ordinal numbers,
like the Spearman or Kendall rank correlation coefficients (Kendall & Stuart 1969). Finally, in
order to find how many mock catalogues show a break in the small-scale correlation function, we
looked for those realizations that have the same rank correlation coefficient as the observed data
at $\Theta > 30$ arcsec (which is 1, indicating perfect agreement between the two sets of ranks),
and a rank correlation coefficient smaller than or equal to that of the data at $\Theta < 30$ arcsec.
Since we are interested in the global behaviour of the correlation function, to avoid local
fluctuations, we considered only 10 values of $\Theta$, linearly equispaced between 10 and 100
arcsec. Independently of the correlation coefficient used, we found that these requirements
are met by ~ 11% of the realizations. Visual inspection has been used to check that, for
the selected realizations, $\bar{w}(\Theta)$ indeed showed a small-scale break.
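
The ranking scheme can be made concrete with a short sketch based on the Spearman coefficient
(no ties assumed). The function names and the two test sequences are illustrative only; the
second sequence is an invented example of a small-scale flattening, not our measurement.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value (ties are not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_vs_scale(w_values):
    """Spearman coefficient between the scale ordering (1..n) and the value ranks.

    w_values must be listed in order of increasing cell size.
    """
    n = len(w_values)
    value_ranks = ranks_descending(w_values)
    d2 = sum((scale_rank - value_rank) ** 2
             for scale_rank, value_rank in enumerate(value_ranks, start=1))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


if __name__ == "__main__":
    scales = range(10, 101, 10)                       # 10 cell sizes, 10 to 100 arcsec
    power_law = [0.65 * t ** -0.5 for t in scales]    # monotonically decreasing
    with_break = [0.12, 0.15, 0.13, 0.11, 0.10, 0.09, 0.08, 0.07, 0.066, 0.065]
    print(spearman_vs_scale(power_law), spearman_vs_scale(with_break))
```

A monotonically decreasing function returns exactly 1, while any inversion at small scales
lowers the coefficient; the selection described above applies this test separately below and
above 30 arcsec.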

Thus, the simulations show that if the population of LBGs is clustered at all scales with
a correlation function similar to the best-fitting power-law at $\Theta > 40$ arcsec, fluctuations
produce a break similar to the observed one in ~ 11% of the cases. Such a confidence level
is not high enough to conclude that the break is a true feature of the clustering properties
of LBGs and not due to statistical fluctuations. However, it is not low enough to reject
its detection with any confidence, either. Clearly, the reality of the break needs to be
investigated with larger samples.

Other interesting conclusions can be drawn from the analysis of the mock catalogues.
For instance, we can test the accuracy of the blockwise bootstrap method in estimating
the variance and the bias of $\bar{w}$. The comparison of the scatter of $\bar{w}$ in the mock samples
with that in the bootstrap realizations discussed in Section 3.2 shows that at worst they
differ by ~ 30-40% on intermediate scales ($20 \lesssim \Theta \lesssim 60$ arcsec), with the
bootstrap errors being larger. Note, however, that cosmic errors for $\bar{w}$ depend on the three-
and four-point correlation functions, and these are different in the mock and real catalogues.
Moreover, the bootstrap analysis also includes field-to-field variations that are not present in
the mock catalogues. For this reason, to assess the quality of our method for estimating the errors
in the correlation function, it would be preferable to compare bootstrap and true errors of
the same point process. To do this, we pick a mock realization at random, perform a
bootstrap analysis on it, and compare the results with the true scatter among the 1100 mock
realizations. We find that the size of the bootstrap errors depends on the realization from which
they have been generated. If its correlation function is strongly biased low with respect
to the population value, the bootstrap errors tend to underestimate the uncertainties in
$\bar{w}$ by 40-50%. On the other hand, if the realization which has been bootstrapped shows
particularly strong correlations, bootstrap errors will be overestimated. On average, we find
that bootstrap errors are very reliable even though they tend to slightly underestimate the
uncertainty in $\bar{w}$. Conversely, the bias of $\bar{w}$ can be underestimated by a factor as large
as ~ 1.5 using the bootstrap method.
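
The comparison between bootstrap and true errors can be sketched as follows. Note the
assumption: this toy version resamples whole fields with replacement, which is a simplification
and is not meant to reproduce the blockwise scheme of Section 3.2; the function names and the
toy counts are ours.

```python
import random
import statistics


def wbar_from_counts(counts):
    """Second-factorial-moment estimate, <N(N-1)> / <N>^2 - 1."""
    n_cells = len(counts)
    mean = sum(counts) / n_cells
    if mean == 0:
        raise ValueError("all cells are empty")
    return sum(n * (n - 1) for n in counts) / n_cells / mean**2 - 1.0


def bootstrap_scatter(fields, n_boot, rng):
    """Standard deviation of the estimate over resamplings of whole fields.

    fields is a list of per-field lists of counts in cells.
    """
    estimates = []
    for _ in range(n_boot):
        resampled = [rng.choice(fields) for _ in fields]    # draw fields with replacement
        pooled = [n for field in resampled for n in field]  # pool the cells of the draw
        estimates.append(wbar_from_counts(pooled))
    return statistics.pstdev(estimates)


if __name__ == "__main__":
    rng = random.Random(3)
    toy_fields = [[0, 0, 1, 3], [1, 1, 1, 1], [2, 0, 0, 2], [0, 1, 0, 4]]
    print("bootstrap scatter:", bootstrap_scatter(toy_fields, 500, rng))
```

The true scatter would instead be the standard deviation of the same estimate over independent
mock realizations, which is the quantity the bootstrap is compared with in the text.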

We have also used the simulations to study how the PDF of the correlation estimator
changes with the cell scale (see Szapudi et al. 2000 for a similar analysis in three dimensions).
Accurate modeling of this would be extremely important for developing robust maximum
likelihood methods. In Figure 12, we plot the PDF of $\bar{w}$ for cells with $\Theta = 10$ arcsec and
$\Theta = 100$ arcsec. To facilitate the comparison of the distributions, we plot them as functions
of the normalized residuals $\Delta w = (\bar{w} - \langle \bar{w} \rangle)/\sigma_w$, where $\sigma_w$
denotes the r.m.s. value of $\bar{w} - \langle \bar{w} \rangle$. In this way, both the probability
distributions have vanishing mean and unit variance. As expected, these distributions are
positively skewed. However, the cosmic bias
is always negative, meaning that it is more likely to underestimate the correlation function
than to overestimate it. It is interesting to note that the skewness of these PDFs markedly
increases when going from $\Theta = 10$ arcsec (where shot noise dominates the variance of the
CIC) to $\Theta = 100$ arcsec (where finite volume effects are the major source of uncertainty).
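
For reference, the normalized residuals and their skewness can be computed from a list of mock
estimates as in the minimal sketch below (function names are ours; the input list in the usage
example is invented).

```python
import math


def normalized_residuals(estimates):
    """Return (w - <w>) / sigma_w, with sigma_w the r.m.s. of w - <w>."""
    n = len(estimates)
    mean = sum(estimates) / n
    rms = math.sqrt(sum((w - mean) ** 2 for w in estimates) / n)
    if rms == 0:
        raise ValueError("all estimates are identical")
    return [(w - mean) / rms for w in estimates]


def skewness(estimates):
    """Third moment of the normalized residuals (zero mean, unit variance)."""
    residuals = normalized_residuals(estimates)
    return sum(r**3 for r in residuals) / len(residuals)


if __name__ == "__main__":
    toy = [1.0, 2.0, 3.0, 4.0, 10.0]   # long tail towards large values
    print(skewness(toy))
```

A positive value indicates a tail towards large estimates; with such a tail the median lies below
the mean, so a typical realization falls below the average even before the negative cosmic bias
is taken into account.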



6. ARE CURRENT SAMPLES REPRESENTATIVE OF THE LBG POPULATION?

In this section, we consider the problem of whether or not the current samples provide a
fair representation of the clustering properties of LBGs. For our purposes, a galaxy sample
is "fair" if the correlation function measured from it differs from the population value by less
than a specified (usually small) amount, within a reasonable confidence level. The analysis
of the mock catalogues is very useful to gain some insight into this problem. The simulations
show that if the true clustering strength of LBGs is as strong as the one that we have
measured, then we should expect correlation estimates extracted from samples similar in
size and geometry to ours to be significantly biased. In addition, the cosmic scatter between
the measures obtained from different samples should be large. For example, Figure 11 shows
that the probability that the measure of $\bar{w}(\Theta)$ from a sample with the size of our dataset
differs from the population value by less than 20% is 34% and 25% for $\Theta = 50$ and 100
arcsec, respectively. It also shows that the value of $\bar{w}$ at $\Theta = 50$ (100) arcsec
is underestimated by more than a factor of 2 in 21% (35%) of the cases, while the
probability of overestimating the correlation function by more than a factor of 1.5 is instead
very small, namely 4% and 3% for $\Theta = 50$ and 100 arcsec. This is consistent with the fact
that the variance of $\bar{w}$ on these scales is comparable with the observed correlation
$\bar{w}_{\rm obs}$ (see Figure 9). The situation is worsened by the fact that the bias of the estimator
of the correlation function is also of the same order of magnitude as $\bar{w}_{\rm obs}$, at least on
large scales. Note also that the true bias is likely to be larger than the value derived from the
simulations, because the intrinsic clustering strength of the galaxies is likely to be larger than
the value used in the simulations. The fluctuations on the scales of our sample, therefore, will be
larger.
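
The probabilities quoted above are simple fractions over the mock realizations. A minimal
sketch of the bookkeeping, with a hypothetical function name and an invented list of estimates,
is:

```python
def fairness_fractions(estimates, truth, tolerance=0.2, low_factor=2.0, high_factor=1.5):
    """Fractions of realizations close to, well below, and well above the true value.

    Assumes truth > 0. Returns (within tolerance, underestimated by more than
    low_factor, overestimated by more than high_factor).
    """
    n = len(estimates)
    close = sum(1 for w in estimates if abs(w - truth) < tolerance * truth)
    too_low = sum(1 for w in estimates if w < truth / low_factor)
    too_high = sum(1 for w in estimates if w > high_factor * truth)
    return close / n, too_low / n, too_high / n


if __name__ == "__main__":
    toy = [0.9, 0.4, 1.1, 1.6, 0.5, 0.85, 0.3, 1.0]   # invented estimates, truth = 1
    print(fairness_fractions(toy, truth=1.0))
```

Applied to the 1100 mock estimates at a given $\Theta$, with the population value as `truth`,
the three fractions correspond to the three probabilities discussed in the text.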

Two conclusions emerge from this discussion. Firstly, it is clear that much larger samples
are needed to obtain precision measures of the correlation function of LBGs. Secondly, the
simulations show that the direction of the bias that affects the available samples is such that
the clustering strength of LBGs is probably underestimated by the current measures. Thus,
the conclusion that these sources are strongly clustered in space (as we shall quantify later)
is very likely a robust statement.



7. THE CORRELATION FUNCTION FROM PAIR COUNTS 

As a final check, and to understand the stability of our results with respect to the method of
analysis, we re-computed the angular correlation function of LBGs using a different set of
estimators, not involving CIC. Specifically, we have measured the integrated pair-count function,
defined as the fractional excess of LBG pairs at angular separations smaller than $\theta$ over the
expectation from a Poisson distribution. This is expressed in terms of the two-point function
$w(\theta)$ as
where $\phi(\theta)$ is a weighting function which depends on the geometry of the sample as,



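$$\phi(\theta) = \sum_i \int_{\Omega_i} d\Omega_1 \int_{\Omega_i} d\Omega_2\;
\delta_{\rm D}(\theta - |\boldsymbol{\theta}_2 - \boldsymbol{\theta}_1|) \qquad (17)$$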



($\delta_{\rm D}$ is the Dirac delta distribution, and the index $i$ runs over the different LBG
fields). The function $\phi(\theta)$ can be determined by Monte Carlo integration. For our sample, the
normalized distribution $\psi$ is very similar to the function given in equation B-10 (and shown in
Figure B1) with $\Theta \simeq 360$ arcsec, but has an extended tail at large angular separations
(i.e. for $\theta > 500$ arcsec). For angles much smaller than the sample size,
$\phi \sim 2\pi\theta$ and $\psi \sim 2\theta$, and assuming $w(\theta) = A_w\theta^{-\beta}$ with
$\beta < 2$, one finds that the integrated pair-count function is
$\simeq [2A_w/(2-\beta)]\,\theta^{-\beta}$. However, since
in any finite sample there are fewer pairs of galaxies with separation equal to $\theta$ than on the
projected sphere, it will soon depart from its asymptotic behaviour, and, if $\beta > 0$, it will
assume larger values. This means that, if $w(\theta)$ is a power-law, the integrated pair-count
function is not, while $\bar{w}(\Theta)$ is. This is one of the reasons why we have used $\bar{w}$ as
our primary statistic. Useful information, however, is obtained from the pair counts, as we will
show in a moment.

A number of estimators of the pair-count correlation function have been proposed. These account
for boundary effects and the geometry of the sample by comparing the distribution of the angular
separations between the galaxies to that of a high-density Poisson process covering the sample
area. We used the three estimators

$$w_1 = \frac{DD}{RR} - 1 \qquad (18)$$

$$w_2 = \frac{DD - 2DR + RR}{RR} \qquad (19)$$

$$w_3 = \frac{DD \cdot RR}{DR^2} - 1 \qquad (20)$$
proposed by Peebles (1980), Landy & Szalay (1993), and Hamilton (1993) respectively, where
DD, DR, and RR are the fractions of (distinct) data-data, data-random, and random-random
pairs with an angular separation smaller than $\theta$, suitably normalized. As in Section 3, in
one case we computed the pair-count correlation function by combining all the fields in a single
sample (which minimizes the integral constraint bias); in another case, we took the average of the
measures from each single field (which minimizes spurious signals coming from field-to-field
variations). For $\theta < 40$ arcsec, the values from the two methods (for each estimator) differ at
most by ~ 0.05, with the first method giving higher correlations as in the CIC analysis. The
difference decreases with increasing $\theta$, becoming ~ 0.01 for $\theta \gtrsim 100$ arcsec,
suggesting that on large scales methods based on pair counts could be less affected by the integral
constraint bias than CIC, perhaps because they account for edge effects more effectively (see below).
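
The three estimators can be sketched in a few lines once the normalized pair fractions are
available. In the sketch below the pair counting is brute force on a flat sky (adequate for
small fields), the first estimator is taken in its natural form DD/RR - 1, which is an
assumption on our part, and all function names and the toy catalogues are illustrative.

```python
import math
import random
from itertools import combinations, product


def pair_fraction_auto(points, theta):
    """Fraction of distinct pairs within one catalogue separated by less than theta."""
    pairs = list(combinations(points, 2))
    close = sum(1 for a, b in pairs if math.dist(a, b) < theta)
    return close / len(pairs)


def pair_fraction_cross(data, randoms, theta):
    """Fraction of data-random pairs separated by less than theta."""
    close = sum(1 for a, b in product(data, randoms) if math.dist(a, b) < theta)
    return close / (len(data) * len(randoms))


def estimators(dd, dr, rr):
    """Return (w1, w2, w3) from normalized pair fractions."""
    w1 = dd / rr - 1.0                  # natural form; denominator RR is assumed here
    w2 = (dd - 2.0 * dr + rr) / rr      # Landy & Szalay (1993)
    w3 = dd * rr / dr**2 - 1.0          # Hamilton (1993)
    return w1, w2, w3


if __name__ == "__main__":
    rng = random.Random(5)
    size = 500.0                                         # toy square field, in arcsec
    data = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(150)]
    randoms = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(600)]
    theta = 60.0
    dd = pair_fraction_auto(data, theta)
    dr = pair_fraction_cross(data, randoms, theta)
    rr = pair_fraction_auto(randoms, theta)
    print(estimators(dd, dr, rr))
```

Using fractions rather than raw counts takes care of the normalization by the different numbers
of data and random points; for an unclustered data set all three estimates should scatter
around zero.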

Results obtained combining all the fields together are shown in Figure 13. The estimators
by Hamilton (1993) and Landy & Szalay (1993) are in nearly perfect agreement, while $w_1$
gives stronger correlations on large scales. This latter estimator is subject to an uncertainty
proportional to the error in the galaxy surface density, $\delta\mathcal{N}$ (e.g. Hamilton 1993).
While this error tends to zero when a large number of catalogues is considered, the expectation
value of $w_1$ still differs from the true correlation. This bias, which is proportional to the
correlation function evaluated on the scale of the sample, scales like the variance of the errors
in the galaxy surface density. On the other hand, $w_2$ and $w_3$ show fluctuations with respect to
the expectation value which are proportional to $(\delta\mathcal{N})^2$, even when they are computed
from the single fields. At separations which are small compared to the size of the single fields,
the average over a large number of samples is affected by the same bias as $w_1$, while at large
separations the precise form of the correction depends on the particular estimator used to measure
the correlation function (Hamilton 1993; Landy & Szalay 1993; Maddox et al. 1996).

Essentially, $\bar{w}$ and the pair-count function are averages of the same correlation function
done with different weighting schemes. Specifically, $\bar{w}$ is obtained with a relatively wide
smoothing kernel, while the pair-count function is more sensitive to local variations (at least for
$\theta < 300$ arcsec, where $\psi$ is increasing with $\theta$). Thus, $\bar{w}$ describes the global
trend of the correlation function, while the pair-count function emphasizes the
scales at which large fluctuations with respect to the mean trend are found. This is useful,
for instance, to study the clustering properties on very small scales, e.g. where $\bar{w}$ seems to
depart from its smooth large-scale behaviour. This is illustrated in Figure 13, which shows
that the pair-count statistic is in overall good agreement with the counts-in-cells analysis.
On large scales, $\theta \gtrsim 50$ arcsec, both $w_2$ and $w_3$ are well approximated by a power-law
with slope $\beta \simeq 0.5$, while $w_1$ has a shallower one, $\beta \simeq 0.3$. The clustering
amplitude from the CIC analysis is intermediate between the various pair-count estimates. In
particular, $w_2$ and $w_3$ are ~ 20% lower than the CIC value (in agreement with the maximum
likelihood analysis in Section 3.4.3), while $w_1$ is slightly higher. These differences reflect the
magnitude with which the bias of the estimators affects the measures.

The pair-count function also shows a weaker clustering strength at small scales than the
extrapolation of the corresponding large-scale fit, of similar statistical significance to the
one observed for $\bar{w}(\Theta)$. For $\theta < 70$ arcsec, the correlation functions in Figure 13
show large oscillations relative to the fit at larger scales. A lack of galaxy pairs at
$\theta < 25$ arcsec with respect to the expectation from the same fit is observed as well. For
example, there are 224 pairs of LBGs separated by less than 20 arcsec in our sample, while in the
absence of the break we would have expected ~ 243.3 pairs (this is obtained assuming
$\bar{w}(\Theta) = 0.65/\Theta^{0.50}$; if the galaxy correlation function is 20% lower than the CIC
results, ~ 236.6 pairs are expected). Since the pair counts and $\bar{w}(\Theta)$ are differently
affected by shot noise, we interpret this lack of small-scale signal in both statistics as an
additional indication that the feature is due to a real lack of pairs with small angular
separations in the data, albeit detected with low signal-to-noise ratio.
The pair-count function is better suited than $\bar{w}(\Theta)$ to study the correlation function at
large scales. When using the CIC, we prevented edge effects from contaminating the measure of
$\bar{w}(\Theta)$ at large separations by considering only those cells that did not overlap with the
boundaries of the samples, which implied some loss of information. By contrast, all the
galaxy-galaxy separations are used in the pair counts. Note that the relative weight of galaxies
lying near the boundaries with respect to those sitting at the centre of a field increases with the
angular separation. In particular, for separations much smaller than half the field size, objects at
the edge have half the weight of those at the centre. The opposite situation is found for
separations larger than half the sample size, where only objects at the edge can contribute.
This suggests that large-scale correlations should be more reliably measured by pair counts than by
the CIC analysis. Using pair counts, we find that the correlation function of LBGs deviates from a
power-law behaviour also at large scales ($\theta > 180$ arcsec, see the inset of Figure 13). This is
evident with all the estimators we considered. However, we do not think that this feature is
real. In fact, because of the integral constraint, we expect the estimated correlation function
to oscillate around zero on scales comparable with the sample size. At
the first zero-crossing a sudden departure from any small-scale smooth behaviour is then
expected. Since the break appears on a scale corresponding to one third of the linear size of
our smallest fields, this could explain its presence. To test if the break is due to boundary
effects of our estimators, we used the mock catalogues discussed in Section 5.1. We found
that large-scale breaks similar to the observed one are quite common. Sometimes, adding a
constant value to the correlation function is enough to restore the missing large-scale power.
Note that the presence of this spurious break at $\theta > 180$ arcsec could explain why
G98 found a steeper correlation than reported here, since they considered all the data with
angular separations $\theta < 330$ arcsec to determine the best-fitting power-law.



8. DISCUSSION 

One of the motivations that led us to revisit the measure of the angular correlation
function of LBGs was to test the robustness of the previous results, namely that these
galaxies are characterized by strong spatial clustering, with a correlation length that rivals
that of local galaxies. The new measure discussed here confirms this result, and also gives
some insight into the shape of the correlation function. Interestingly, our error analysis shows
that the conclusion that the clustering strength of $z \sim 3$ LBGs is larger than that of the mass
for essentially all commonly adopted cosmological models holds (at the $3\sigma$ level), implying
that these sources are highly biased tracers of the mass distribution.

We have used the Limber transform to derive from $w(\theta)$ both the spatial correlation
length and the bias parameter in a set of reference cosmological scenarios. We considered a
matter-dominated, low-density universe (OCDM) with present-day mass density parameter
$\Omega_{\rm M} = 0.3$, vacuum density parameter $\Omega_\Lambda = 0$ and Hubble constant
$H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$, with $h = 0.7$; a flat, vacuum-dominated, low-density universe
($\Lambda$CDM) with $\Omega_{\rm M} = 0.3$, $\Omega_\Lambda = 0.7$ and $h = 0.7$; and an Einstein-de
Sitter model ($\tau$CDM) with $\Omega_{\rm M} = 1$, $\Omega_\Lambda = 0$
and $h = 0.5$. In all cases, we have assumed that the linear power spectrum of density
perturbations approximates the cold dark matter (CDM) one with primordial spectral index
$n = 1$, transfer function by Bardeen et al. (1986), and spectral shape parameter $\Gamma = 0.21$.
The amplitude of the power spectrum is fixed to reproduce the observed abundance of rich
galaxy clusters in the local universe (e.g. Eke et al. 1996; Jenkins et al. 1998).

To derive the bias parameter of the LBGs we first computed the non-linear autocorrelation
of mass density fluctuations at redshift $z$, $\xi_{\rm m}(r, z)$ (with $r$ in comoving units), adopting
the algorithm by Peacock & Dodds (1996). We then used the small-angle version of the
Limber equation to calculate the angular correlation function of a set of objects which trace
the mass density field, and are distributed in redshift like the LBGs, namely
with $N(z)\,dz$ the galaxy number counts in the redshift shell $z, z + dz$, $D_{\rm M}(z)$ the proper
motion distance (known also as the transverse comoving diameter distance), and $R_{\rm H}(z)$ the
Hubble radius at redshift $z$,
This assumes that: i) $\theta \ll 1$ (with $\theta$ in radians); ii) the spatial correlation length
($r_0(z)$, such that $\xi[r_0(z), z] = 1$) is much smaller than the depth of the survey; iii) the
thickness of the comoving shell in which $N \neq 0$ is comparable with the depth of the survey. All
these requirements are satisfied for both the LBGs and the mass distribution. We derived $N(z)$
from the redshift distribution presented by Giavalisco & Dickinson (2001), which includes
546 spectroscopic redshifts, and interpolated the histogram of redshifts with a cubic spline.
Finally, we computed $\bar{w}_{\rm m}(\Theta)$ by averaging $w_{\rm m}(\theta)$ over the distribution of
angular separations corresponding to a circular cell (see equation (B-10) and Figure B1). The
effective bias parameter, $b_{\rm eff}$, is defined as
$b_{\rm eff}(\Theta) = [\bar{w}(\Theta)/\bar{w}_{\rm m}(\Theta)]^{1/2}$, and we computed it in the same
range of $\Theta$ where we studied $\bar{w}(\Theta)$. The effective bias as a function of the angular
separation is plotted in Figure 14, which shows that, independently of the underlying cosmological
parameters, LBGs are strongly biased tracers of the mass distribution if this is similar to the
CDM one. Moreover, for $\Theta > 30$ arcsec, $b_{\rm eff}$ shows little evolution with $\Theta$.
Choosing $\Theta = 60$ arcsec as a reference case, from the raw data we find

$$b_{\rm eff} = 2.2^{+0.5}_{-0.5},\quad 2.8^{+0.5}_{-0.6},\quad 3.9^{+0.8}_{-0.9},$$

for OCDM, ΛCDM and τCDM, respectively. Correcting for the bias of the estimator, one
gets

$$b_{\rm eff} = 2.4^{+0.4}_{-\ldots},\quad 3.0^{+0.5}_{-\ldots},\quad 4.2^{+0.7}_{-\ldots}$$

in the three cases.



The Limber equation can also be used to deproject the observed angular correlation
function and derive the spatial correlation length, $r_0$, of LBGs. For this calculation, we have
neglected the possibility of the presence of a small-scale break in the two-point correlation
function of LBGs, and used the best-fitting power-law model of the data at $\Theta > 40$ arcsec
as representative of the clustering properties of the population. The function $w(\theta)$ has been
obtained using equation (B-7) in Appendix B. Note that we have assumed that the power-law
model that describes the data at $\Theta < 100$ arcsec still provides a good approximation to
the angular correlation function on scales which have not been tested by the observations.
In principle, this (unavoidable) extrapolation could introduce strong systematic errors in
the measure of the spatial correlation length. However, the observed scatter in the galaxy
number density between the fields is compatible with this hypothesis (see Section 3.3). In this
case, the spatial correlation function, $\xi(r, z)$, is also described by a power law. In particular,
if the redshift dependence of this function can be factorized as $\xi(r, z) = F(z)\,(r/r_0)^{-\gamma}$, then
the corresponding $w(\theta)$ has the form $w(\theta) = A_w\,\theta^{-\beta}$, where $\beta = \gamma - 1$, with

$$A_w = r_0^{\gamma}\, \frac{\Gamma(1/2)\,\Gamma[(\gamma-1)/2]}{\Gamma(\gamma/2)}\, \frac{\int_0^{\infty} [dz/R_H(z)]\, F(z)\, D_M^{1-\gamma}(z)\, N^2(z)}{\left[\int_0^{\infty} N(z)\, dz\right]^2} \qquad (23)$$

(Totsuji & Kihara 1969; Peebles 1980). The function $F(z)$ is not known, but in the case of
LBGs this is not a limitation, since the corresponding distribution of cosmic time is considerably
narrower (and more peaked) than that of traditional flux-limited redshift surveys, and
little evolution of their clustering strength is expected over such a narrow range of cosmic
time. In this case, the function $F(z)$ can be taken out of the integral in equation 23, and the
quantity $r_0(z) = r_0\,[F(z)]^{1/\gamma}$ is the correlation length at the epoch of the observations. We
have verified this approximation by considering the three cases of fixed clustering pattern
in proper coordinates ($F(z) \propto (1 + z)^{-(3-\gamma)}$), linearly growing clustering ($F(z) \propto D_+^2(z)$,
with $D_+(z)$ the growth factor of linear density fluctuations), and fixed clustering pattern
in comoving coordinates ($F(z) = {\rm const}$). Using these approximations returns values for $r_0$
which differ by a few parts in a thousand.
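
The link between $r_0$ and the angular amplitude in equation 23 can be illustrated with a short numerical sketch. It is not the calculation performed in this work: it assumes $F(z) = 1$, an Einstein-de Sitter background, and a toy Gaussian $N(z)$ centred at $z = 3$ in place of the spline-interpolated redshift histogram. All function names and default parameters are hypothetical.

```python
import math

C_OVER_H0 = 2997.92458  # Hubble radius today, c/H0, in h^-1 Mpc


def d_m(z):
    """Transverse comoving distance for an Einstein-de Sitter model (assumed)."""
    return 2.0 * C_OVER_H0 * (1.0 - 1.0 / math.sqrt(1.0 + z))


def r_h(z):
    """Hubble radius c/H(z) for an Einstein-de Sitter model (assumed)."""
    return C_OVER_H0 * (1.0 + z) ** -1.5


def n_z(z, z0=3.0, sigma=0.3):
    """Toy Gaussian redshift distribution (hypothetical stand-in for N(z))."""
    return math.exp(-0.5 * ((z - z0) / sigma) ** 2)


def trapz(f, a, b, n=2000):
    """Composite trapezoidal rule on [a, b] with n intervals."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))


def limber_amplitude(r0, gamma, zmin=1.5, zmax=4.5):
    """A_w of equation 23 (theta in radians) for xi = (r/r0)^-gamma and F(z) = 1."""
    gam = math.gamma(0.5) * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma)
    num = trapz(lambda z: n_z(z) ** 2 * d_m(z) ** (1.0 - gamma) / r_h(z), zmin, zmax)
    den = trapz(n_z, zmin, zmax) ** 2
    return r0 ** gamma * gam * num / den


def r0_from_amplitude(a_w, gamma, **kw):
    """Invert equation 23, using the fact that A_w scales as r0^gamma."""
    return (a_w / limber_amplitude(1.0, gamma, **kw)) ** (1.0 / gamma)
```

Since the amplitude refers to $\theta$ in radians, a fit expressed in arcsec has to be rescaled by $206265^{\beta}$ before the inversion. Because $A_w \propto r_0^{\gamma}$, the deprojection reduces to a single evaluation of the integrals followed by a power.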

Adopting the power-law models obtained using 4 principal components of the bootstrap 
errors to fit the raw data, we find: 

ro = 3.5l:\i h^^ Mpc; 4.11^;° h'^ Mpc; 2.41°;^ h'^ Mpc, 

for OCDM, ΛCDM and τCDM, respectively. Correcting for the bias of the correlation
estimator, one gets

ro = 3.9+};^ h-^ Mpc; 4.6+^;^ h'^ Mpc; 2.7+?;^ h'^ Mpc. 

The error bars have been determined as follows. We assumed Gaussian residuals for the
principal component analysis presented in Section 4, so that each point in the parameter
space can be associated with a probability through the $\chi^2$ function. A histogram of $r_0$
values has then been constructed, mapping the parameter space and using the $\chi^2$ probabilities
as weights. The 95% confidence interval around the most probable value (determined
by locating the 2.5 and 97.5 percentiles) has been taken as the uncertainty for $r_0$. Note
that the most probable values tend to be slightly higher than those presented above (we
get $r_0 = 3.8, 4.3, 2.6\ h^{-1}$ Mpc and $r_0 = 4.4, 5.1, 3.0\ h^{-1}$ Mpc for the raw and bias-corrected
data, respectively). We preferred to quote the correlation length corresponding to the best-fitting
power-law model, since this is not affected by the assumption of Gaussian residuals.
However, since $r_0$ is obtained with a non-linear transformation of the parameters $A$ and $\beta$,
this could introduce some systematic error in the final estimates. When we do not correct
for the bias of the estimator, our results are in very good agreement with G98. Thus, although
the CIC method yielded an angular correlation function which is shallower and has
a smaller amplitude than that presented by G98, the implied correlation length seems to
be robust. This is because the parameters of the power-law model are strongly covariant,
and the comoving correlation length $r_0$ is much more tightly constrained than either $A$ or $\beta$
individually.
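
The percentile construction described above can be sketched as follows. This is only an illustration under stated assumptions: the parameter space is sampled on a uniform grid of $(A, \beta)$ points, the weight of each point is taken to be $\exp(-\Delta\chi^2/2)$ as for Gaussian residuals, and `r0_of` is a hypothetical callable that performs the Limber deprojection.

```python
import math


def weighted_percentile(values, weights, q):
    """Smallest value at which the cumulative weight reaches a fraction q (0 to 1)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    acc = 0.0
    for v, w in pairs:
        acc += w
        if acc >= q * total:
            return v
    return pairs[-1][0]


def r0_interval(grid, chi2, r0_of):
    """95% interval for r0.

    grid  : list of (A, beta) points covering the parameter space
    chi2  : chi^2 value of the fit at each grid point
    r0_of : callable (A, beta) -> r0 (hypothetical deprojection routine)
    """
    chi2_min = min(chi2)
    # Gaussian-likelihood weights (assumption): exp(-delta chi^2 / 2)
    weights = [math.exp(-0.5 * (c - chi2_min)) for c in chi2]
    r0 = [r0_of(a, b) for a, b in grid]
    return (weighted_percentile(r0, weights, 0.025),
            weighted_percentile(r0, weights, 0.975))
```

Because the map from $(A, \beta)$ to $r_0$ is non-linear, the mode of the weighted histogram need not coincide with the $r_0$ of the best-fitting model, which is the effect noted in the text.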

The interesting results of this study are that the correlation function of LBGs at $z \sim 3$
is very well approximated by a power law at angular separations $40 < \theta < 180$ arcsec and, perhaps
more importantly, that we confirm the strong spatial clustering, and hence implied large bias,
reported in previous works (Steidel et al. 1998; G98; A98). The correlation length discussed
here is somewhat smaller than that found by A98, who measured it using the counts-in-cells
statistics in a three-dimensional sample. Perhaps part of the difference can be explained by
the possible presence of clustering segregation with luminosity (Giavalisco & Dickinson
2001), since the spectroscopic sample considered by A98 is about 0.5 magnitudes brighter
than the purely photometric samples used here and by G98. More likely, the difference could
be due to statistical fluctuations, since their sample is significantly smaller (in terms of
number of galaxies) than the one discussed here and it includes fewer fields, making it more
prone to cosmic variance effects. Redshift distortions, either due to peculiar motions or to
uncertainties in the measure of the systemic redshifts of the galaxies (e.g. see Pettini et al.
2001), could also contribute to the observed discrepancy. We are currently investigating this
possibility (Porciani & Giavalisco, in preparation). Note that our results for $b_{\rm eff}$ and $r_0$ might
be biased towards high values if our sample is affected by significant field-to-field variations,
while they might be biased towards low values if inter-field fluctuations are not important
and the bootstrap method underestimates the bias of $\bar{w}$.

The value of the slope of the correlation function is the main difference with the measure
by G98, who report $\beta = 0.98 \pm 0.32$, somewhat larger than the value $0.50^{+0.25}_{-0.50}$ found here.
This discrepancy could be a consequence of the wider range of angular separations considered
by G98 for their power-law modeling of the data, i.e. $\theta < 330$ arcsec, in combination with
the assumption that error bars at different angular separations are statistically independent.
In fact, we found that the correlation function of LBGs has a large-scale break at $\theta \gtrsim 180$
arcsec, probably due to the integral constraint bias. Note, however, that even though the
best-fitting values for $\beta$ differ by a factor of 2, the $1\sigma$ errors significantly overlap. The result
$\beta \sim 0.5$ is more in line with the values observed at intermediate redshift (Le Fèvre et al.
1996; Carlberg et al. 1997). This is an interesting and useful constraint on models of
galaxy formation.

The strong spatial clustering of LBGs suggests that they are associated with massive 
structures. In other words, to simultaneously reproduce the clustering strength and the 
spatial abundance of the LBGs in the framework of the cold dark matter model, their hosting 
dark matter halos must be relatively massive (e.g. see the discussion in G98; A98; Giavalisco 
& Dickinson 2001). Note that the individual galaxies need not coincide with the massive
halos, but simply trace their spatial position. For example, a strong clustering would also
be observed in a scenario where LBGs are associated with sub-halos of low mass, which are
satellites of massive ones (e.g. Kolatt et al. 1999; Wechsler et al. 2000). In this model,
the sub-halos become active star formers, and thus observable, as a result of merging and
interactions. Whatever the specific mechanism, it is important to realize that for strong
clustering to be observable the galaxies that numerically dominate the
sample must be associated with strongly clustered regions of the mass density field. An
implication is that "field halos" of small mass (i.e. not associated with more massive ones)
cannot significantly populate the samples, because they are much more numerous and less
clustered than the observations would imply. Thus, as discussed in Giavalisco & Dickinson
(2001), the clustering strength can be used, in conjunction with an assumed mass spectrum,
to constrain the relationship between mass and UV luminosity.

An intriguing result of this study is the possibility that the correlation function has
a break at small angular scales, $\Theta \lesssim 30$ arcsec, where it seems to become smaller than the
extrapolation of the power law fitted at large scales. The $\chi^2$ test and the fact that the feature
is reproduced in each of the sub-samples if we split our LBG sample in two parts suggest that
the break is real. On the other hand, the confidence level of the break detection, taking the
numerical simulations at face value, is marginal, namely 90%, and more data are required to
confirm or reject it. It is nonetheless interesting to briefly discuss some of the implications.

A small-scale break means that there are too few pairs of galaxies luminous enough to
be included in our sample that have angular separations smaller than $\sim 30$ arcsec, compared
to the expectations of the power-law correlation function. This can occur, for example,
if the substructure within the halos hosting the galaxies is such that only one galaxy per
halo is, on average, bright enough to be detected by our survey (the presence of fainter
galaxies is, of course, unconstrained). In this case the break is simply the result of the
average size of the halos, namely of the fact that dark matter halos hosting LBGs have
finite size and are mutually exclusive in space$^{9}$. At $z = 3$, the scale of the break, say
$\Theta_{\rm br} = 25 \pm 5$ arcsec, corresponds to a comoving size of $0.52 \pm 0.10\ h^{-1}$ Mpc for OCDM,
$0.54 \pm 0.11\ h^{-1}$ Mpc (ΛCDM), and $0.36 \pm 0.07\ h^{-1}$ Mpc (τCDM). Assuming that dark matter
halos form from spherically symmetric density perturbations, and that their virialization
epoch is twice the turnaround time, these lengths correspond to the diameters of objects
with mass 2.1+^;^ x lO^^j^^© (OCDM), 1.8+J| x W^Mq (ACDM), 2.5+};^ x IO^^^q (rCDM). 
Thus, the mass scale associated with the break is consistent with that of halos with mass 
and volume density required to reproduce the observed abundance and clustering strength 
(e.g. see Giavalisco & Dickinson 2001). Interestingly, these values are largely insensitive to 
the cosmology. 
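
The conversion from the angular scale of the break to a comoving size can be checked with a short sketch. The density parameters used below ($\Omega_m = 0.3$ for OCDM, $\Omega_m = 0.3$ and $\Omega_\Lambda = 0.7$ for ΛCDM, $\Omega_m = 1$ for τCDM) are assumptions made for this illustration, since the adopted values are not restated in this section. The function names are hypothetical.

```python
import math

C_OVER_H0 = 2997.92458               # c/H0 in h^-1 Mpc
ARCSEC = math.pi / (180.0 * 3600.0)  # one arcsecond in radians


def comoving_distance(z, omega_m, omega_l, n=4000):
    """Transverse comoving distance D_M(z) in h^-1 Mpc (matter + curvature + Lambda)."""
    omega_k = 1.0 - omega_m - omega_l

    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1 + x) ** 3 + omega_k * (1 + x) ** 2 + omega_l)

    # Trapezoidal integration of dz / E(z) from 0 to z
    h = z / n
    integral = h * (0.5 * inv_e(0.0) + sum(inv_e(i * h) for i in range(1, n)) + 0.5 * inv_e(z))
    if omega_k > 1e-8:    # open geometry
        s = math.sqrt(omega_k)
        return C_OVER_H0 * math.sinh(s * integral) / s
    if omega_k < -1e-8:   # closed geometry
        s = math.sqrt(-omega_k)
        return C_OVER_H0 * math.sin(s * integral) / s
    return C_OVER_H0 * integral  # flat geometry


def comoving_size(theta_arcsec, z, omega_m, omega_l):
    """Comoving transverse size (h^-1 Mpc) subtended by theta_arcsec at redshift z."""
    return theta_arcsec * ARCSEC * comoving_distance(z, omega_m, omega_l)


# Assumed parameter sets for the three models (labels only, not from this section)
for name, om, ol in [("OCDM", 0.3, 0.0), ("LCDM", 0.3, 0.7), ("tCDM", 1.0, 0.0)]:
    print(name, round(comoving_size(25.0, 3.0, om, ol), 2))
```

With these assumed parameters, 25 arcsec at $z = 3$ maps to comoving sizes close to the central values quoted in the text, and the $\pm 5$ arcsec uncertainty propagates linearly.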

$^{9}$ This is an example in which Poisson sampling is not valid, and sub-Poisson fluctuations are obtained
from the counts in three-dimensional cells (see, e.g., Mo & White 1996).

In view of the importance of the detection of the break as a possible indicator of the
mass of the galaxies, we have compared the observed $\bar{w}(\Theta)$ to the prediction from the spatial
exclusion model. We have assumed that there is only one visible LBG per dark matter
halo, and that its position coincides with the "center" of the halo. Then, the correlation
function of the galaxies coincides with that of the halos, and a simple model for their angular
clustering, accounting for mutual exclusion, can be built as follows. We have also assumed
that the cross-correlation function between halos of mass $M_1$ and $M_2$ at redshift $z$ is given
by

$$\xi_h(r, M_1, M_2, z) = \begin{cases} b(M_1, z)\, b(M_2, z)\, \xi_m(r, z), & \text{if } r \geq R_1 + R_2 \\ -1, & \text{otherwise} \end{cases} \qquad (24)$$

where $b(M, z)$ is the linear bias parameter of dark matter halos of mass $M$ that have virialized
at redshift $z$ (Mo & White 1996), and the Eulerian radii of the collapsed halos, $R_1$ and $R_2$,
are determined assuming that halos originated from the collapse of spherically symmetric
perturbations that virialized at the epoch corresponding to two turnaround times (e.g. Lacey
& Cole 1993; Kochanek 1995; Bryan & Norman 1998). The mass density autocorrelation
function, $\xi_m(r, z)$, is computed using the method by Peacock & Dodds (1996), as described
earlier. From that we have derived the halo correlation function using the Press-Schechter
mass function $n(M, z)$ (Press & Schechter 1974) as

$$\xi_h(r, z) = \frac{\int dM_1 \int dM_2\, n(M_1, z)\, n(M_2, z)\, \xi_h(r, M_1, M_2, z)}{\left[\int dM\, n(M, z)\right]^2} . \qquad (25)$$

Finally, we have used the Limber equation in the small-angle approximation to transform $\xi_h$
into the angular correlation function of the halos (cf. Coles et al. 1998), and have calculated
$\bar{w}(\Theta)$ using equations (B-8) and (B-10). Since our sample of LBGs is flux limited, we have
considered mass-limited samples of halos in the calculation. Figure 15 shows the results
obtained using dark matter halos with different mass thresholds compared to the observed
correlation function in the ΛCDM cosmology. The halo exclusion effect is clearly observable
in the model $\bar{w}(\Theta)$, despite the strong dilution of the clustering signal due to the angular
projection. Although this model is rather crude (for example, it does not take into account
effects such as non-linear biasing, which are expected to be important in the halo correlation
function at small scales, cf. Catelan et al. 1997; Porciani et al. 1998; Porciani, Catelan &
Lacey 1998), it is in generally good qualitative agreement with the observations, predicting a
$\bar{w}(\Theta)$ that reaches a maximum for $\Theta$ roughly equal to the angular size of the smallest halo
in the mass-limited sample. At larger scales the model $\bar{w}(\Theta)$ approximates a power law and
is unaffected by the exclusion effect, in excellent agreement with the data. At smaller scales,
it declines towards an asymptotic constant value, but unfortunately there the data are too
noisy for any meaningful comparison.
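
The structure of the exclusion model in equations (24) and (25) can be illustrated with a toy calculation. Every ingredient below is a hypothetical stand-in chosen for simplicity: a power-law $\xi_m$ instead of the Peacock & Dodds prescription, an ad hoc bias and mass function instead of the Mo & White and Press-Schechter forms, and arbitrary mass and length units. The sketch only shows how number-weighted averaging over pairs of halos produces a small-scale downturn.

```python
import math


def xi_m(r, r0=2.0, gamma=1.8):
    """Toy power-law mass correlation function (hypothetical)."""
    return (r / r0) ** -gamma


def bias(m):
    """Toy linear bias increasing with halo mass (hypothetical)."""
    return 1.0 + math.sqrt(m)


def radius(m):
    """Toy halo radius, R proportional to M^(1/3) (hypothetical normalisation)."""
    return 0.2 * m ** (1.0 / 3.0)


def mass_function(m):
    """Toy abundance: power law with an exponential cut-off (hypothetical)."""
    return m ** -2.0 * math.exp(-m)


def xi_halo(r, m_min, m_max=20.0, n=60):
    """Number-weighted halo correlation with exclusion, on a logarithmic mass grid."""
    dlnm = math.log(m_max / m_min) / n
    masses = [m_min * math.exp((i + 0.5) * dlnm) for i in range(n)]
    # n(M) dM written as n(M) M dlnM
    weights = [mass_function(m) * m * dlnm for m in masses]
    num = 0.0
    for m1, w1 in zip(masses, weights):
        for m2, w2 in zip(masses, weights):
            if r >= radius(m1) + radius(m2):
                pair = bias(m1) * bias(m2) * xi_m(r)   # linearly biased regime
            else:
                pair = -1.0                            # overlapping halos are forbidden
            num += w1 * w2 * pair
    return num / sum(weights) ** 2
```

In this toy model the correlation equals $-1$ below twice the radius of the smallest halo and tends to the square of the number-weighted mean bias times $\xi_m$ at large separations, so raising `m_min` moves the downturn to larger scales and increases the large-scale amplitude.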

The assumption that a mass-limited sample of halos is the one associated with a flux-limited
sample of galaxies is most likely an oversimplification, since it implies a very tight correlation
between UV luminosity and mass, which we do not expect. Relaxing this assumption and
allowing some dispersion between UV luminosity and mass has three effects. Firstly, by
allowing a larger fraction of smaller halos (which are numerically more abundant than the
massive ones), it decreases the overall clustering strength, in particular the value of $\bar{w}(\Theta)$ at
large scales. Secondly, it affects the number density of the galaxies. And thirdly, it changes
the angular scale of the break, since this is dominated by the size of the most abundant
type of halos in the sample. Our data are clearly not adequate to constrain these three
effects in a realistic way, but this should be possible with future high-precision measures of
the LBG angular clustering, particularly if done as a function of the UV luminosity of the
galaxies. Nonetheless, it is interesting to point out that the model with the best quantitative
agreement with the observations is the one with $M > 5 \times$ 10^^ $M_\odot$, which reproduces both
the large-scale amplitude and the scale of the break of the observed $\bar{w}(\Theta)$, since it also
predicts spatial abundances in good agreement with the observations (G98; A98; Giavalisco
& Dickinson 2001).

Finally, we also recall that the clustering properties discussed here, including the slope
of the correlation function, its amplitude and the possible detection of the small-scale break,
are all relative to the galaxies selected with the color criteria specified by Equation 1. Galaxies
at redshifts similar to those of our LBG sample but with different spectral energy distributions
(e.g. star-forming galaxies with substantial dust reddening or "old" galaxies) will elude 
such color criteria, and, in principle, can have different clustering properties. Since we do 
not know the properties of galaxies at z ~ 3 missed by the LBG selection criteria (but see 
the discussion in Steidel et al. 1999 and Adelberger & Steidel 2000), it is important to keep 
in mind that our knowledge of galaxy clustering at z ~ 3 comes from the samples of LBG 
galaxies discussed here. 

In summary: 

1. we made a new measure of the angular correlation function of LBGs at $z \sim 3$ using
the counts-in-cells statistics, which is significantly less affected by shot noise than the
previous ones based on pair counts. The new measure confirms the strong spatial
clustering of these galaxies, and a Limber deprojection yields a comoving correlation
length

ro = 3.5t\ih^^Mpc; 4.1+^;° /i"^ Mpc; 2.4+°;^ /i^^ Mpc, 

for OCDM, ΛCDM and τCDM, respectively. Very likely these values need to be corrected
upwards by at least 10% due to the integral constraint bias.

2. the strong clustering implies that LBGs are heavily biased tracers of the mass distribution.
Assuming a CDM scenario, we derived the following values for their linear bias
parameter

$$b_{\rm eff} = 2.2^{+0.5}_{-0.5},\quad 2.8^{+0.5}_{-0.6},\quad 3.9^{+0.8}_{-0.9},$$

in our three adopted cosmological models, respectively. Similarly to the correlation
length, these values have probably been underestimated by at least 7%.

3. in the range of angular separations $30 < \theta < 100$ arcsec, the correlation function is
very well approximated by a power law with slope $\beta \sim 0.5$ (or spatial slope $\gamma = 1.5$),
significantly shallower than that from the pair-count measures of $w(\theta)$ (G98; Giavalisco
& Dickinson 2001);

4. we used numerical simulations to quantify the effects of the integral constraint bias in
our measure, and found that present samples do not provide a fair representation
of the LBG population, with the current measure of correlation length being very likely
a lower limit;

5. at small angular scales ($\theta < 30$ arcsec), the correlation function seems to depart from the
power-law model that describes its large-scale behaviour, and becomes smaller. This
effect is detected at the $\sim 90$% confidence level, and, if confirmed, it will set a strong
constraint on the multiplicity function of the halos (Peacock & Smith 2001), namely
the number of LBGs per halo detectable in our sample, which has flux limit $\mathcal{R} < 25.5$.
Assuming the effect is real, the shape of $\bar{w}(\Theta)$ is consistent with one observable LBG
per halo on average (the presence of fainter galaxies is, of course, unconstrained) and
with the halos hosting galaxies with $25 \lesssim \mathcal{R} \lesssim 25.5$ having mass of the order of 10^^

But perhaps the most important result of the discussion above is that it illustrates
how detailed knowledge of the clustering properties of LBGs can provide important
constraints on the physics of galaxy formation, and highlights the importance of precision
measures of the correlation function of galaxies at high redshift. This creates, we believe,
a strong case for building larger, higher-quality samples of galaxies at high redshifts. We
plan to return to this issue in a future work, where we will discuss new data from a larger
survey of LBGs at $z \sim 3$.

Alex Szalay, and Istvan Szapudi for very useful discussions. Finally, we are grateful to
an anonymous referee for his/her useful comments. CP acknowledges the support of a
Golda Meir fellowship at HU and of the EC RTN network "The Physics of the Intergalactic
Medium" at the IoA.

APPENDIX: Poisson Sampling and Shot Noise

Theoretical models for structure formation describe the cosmic mass distribution using
a stochastic field with a continuous support$^{10}$. On the other hand, the galaxy distribution
revealed by astronomical observations has a different nature, being, in practice, a point
process. In spite of this, it is sometimes convenient to describe the galaxy distribution
by means of an (ideal) continuous random field. In this scheme, galaxy distributions are
thought of as discrete samplings of the continuous field. Two levels of stochasticity are implicit
here: a given continuous density distribution is first drawn from the ensemble, and a set
of discrete objects is subsequently generated from the selected realization. We will call
discreteness, or "shot noise", terms those contributions to the statistics of galaxy counts that
derive explicitly from the sampling procedure.

There is no unique way of building a discrete distribution out of a continuous field. This
has often been overlooked, making the fortune of a particularly simple algorithm, originally
proposed by Layzer (1956), and universally known as the "Poisson sampling" method (hereafter
PS). The key assumptions of this sampling technique are that the probability of finding
a galaxy in the infinitesimal volume $dV$ centered in $\mathbf{x}$ is i) proportional to the value of the
continuous density field evaluated at $\mathbf{x}$; and ii) independent of the probability of finding
a galaxy in the neighbouring volume elements. With this choice, the (ensemble averaged)
spatial correlations of the final (discrete) distribution, being unaffected by the sampling
procedure, coincide with those of the underlying continuous field. Since PS acts locally, the
number of galaxies contained within a finite volume (of size $V$) is a random variable whose
statistics depend only on the locally averaged overdensity $\delta_V$. In particular, the conditional
probability of finding $N$ galaxies for a given value of $\delta_V$ is a Poisson distribution with mean
$\lambda = \bar{n}_g (1 + \delta_V) V$, namely

$$p(N|\delta_V) = \frac{\lambda^N}{N!}\, e^{-\lambda} , \qquad (A-1)$$

with $\bar{n}_g$ the mean number density of galaxies. The probability of finding $N$ galaxies within
a generic volume of size $V$ is then

$$P(N) = \int_{-1}^{\infty} \mathcal{P}(\delta_V)\, p(N|\delta_V)\, d\delta_V , \qquad (A-2)$$

with $\mathcal{P}(\delta_V)$ the probability density function (PDF) of the volume-averaged overdensity.

$^{10}$ Hereafter we will use the term "continuous field" referring to the continuous nature of the space that
supports the random field, and not to analytical continuity.

It can be shown that, if the moment generating function of $\delta_V$, $\mathcal{M}(t) = \langle \exp(it\delta_V) \rangle$,
exists and is well-behaved, the corresponding quantity for the discrete counts, $M(t)$, is
obtained through the replacement $M(t) = \mathcal{M}(e^{it} - 1)$ (Fry 1985). In particular, one has
(White 1979)

$$M(t) = \exp\left[ \sum_{n=1}^{\infty} \frac{\bar{N}^n (e^{it} - 1)^n}{n!}\, \frac{1}{V^n} \int_V d^3r_1 \cdots \int_V d^3r_n\, \xi_n(\mathbf{r}_1, \ldots, \mathbf{r}_n) \right] , \qquad (A-3)$$



i.e. the moment generating function of the counts depends on the full hierarchy of (connected)
correlation functions, $\xi_n$, of the underlying continuous field. In all cases, the moments of the
discrete counts are given by (e.g. Peebles 1980)

$$\langle N \rangle = \bar{N}$$
$$\langle (N - \bar{N})^2 \rangle = \bar{N} + \bar{N}^2\, \bar{\xi}_2 \qquad (A-4)$$

with $\bar{\xi}_n = V^{-n} \int_V d^3r_1 \cdots \int_V d^3r_n\, \xi_n(\mathbf{r}_1, \ldots, \mathbf{r}_n)$. When $\bar{N}$ is so small that the first term
in the r.h.s. dominates for each moment, the distribution of counts reduces to a Poisson
distribution with mean $\bar{N}$ and it is said to be discreteness (or shot-noise) dominated. On the
other hand, when $\bar{N}$ is large, the distribution in $(N - \bar{N})/\bar{N}$ reverts to the input distribution
in the continuous variable $\delta_V$. Analogous relations hold for the average angular correlations.
In particular, the two-point correlation considered in the main text is simply

$$\bar{w} = \frac{\langle (N - \bar{N})^2 \rangle}{\bar{N}^2} - \frac{1}{\bar{N}} , \qquad (A-5)$$

where $N$ represents the counts performed in two-dimensional cells. The correlation function
is obtained by subtracting the shot-noise term $1/\bar{N}$ from a non-linear combination of the first
two moments of the counts.
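
A small Monte Carlo experiment illustrates the shot-noise subtraction in equation (A-5). It is a sketch under simplifying assumptions, not the estimator pipeline used in the main text: the cell overdensity takes only the two values $\pm\sigma$ with equal probability (so that its variance is exactly $\sigma^2$), and the cells are statistically independent. Function names and parameter values are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Poisson deviate by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_counts(n_cells, mean_count, sigma, rng):
    """Poisson-sample a toy overdensity field that equals +sigma or -sigma in each cell."""
    counts = []
    for _ in range(n_cells):
        delta = sigma if rng.random() < 0.5 else -sigma
        counts.append(poisson(mean_count * (1.0 + delta), rng))
    return counts


def w_bar(counts):
    """Equation (A-5): variance over squared mean, minus the shot-noise term 1/mean."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / n
    return var / mean ** 2 - 1.0 / mean


rng = random.Random(42)
counts = simulate_counts(100_000, 5.0, 0.5, rng)
print(w_bar(counts))  # should scatter around sigma^2 = 0.25
```

Without the $1/\bar{N}$ term the same statistic would overestimate the variance of the underlying field by 0.2 for a mean count of 5, which shows why the correction matters most for sparsely populated cells.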

The standard approach in cosmology is to use equation (A-4) to define the average
correlation function of the galaxy population, $\bar{\xi}_n$. However, it is important to keep in mind
that Poisson sampling is only a model to generate a population of discrete objects tracing
a continuous density distribution, and we do not know if it is applicable to galaxies. It is easy
to think of point processes that are not obtained from this description. Such distributions,
like the simple case of hard spheres, can even show sub-Poisson fluctuations ($\langle (N - \bar{N})^2 \rangle < \bar{N}$),
while $\bar{\xi}_2 \geq 0$ for a continuum distribution.






APPENDIX: Spectral Analysis of 2-D Random Fields

Let us consider the random field, $\delta(\mathbf{x})$, defined on the Euclidean plane, $\mathbb{R}^2$, and having
vanishing mean value. Such a field can be conveniently used to describe the galaxy overdensity
field over small regions of the sky, where the curvature of the celestial sphere can be
neglected. If we assume that $\delta$ is stationary (i.e. its statistical properties are invariant under
spatial translations and rotations), its power spectrum can be defined as follows:

$$\langle \tilde{\delta}(\mathbf{k}_1)\, \tilde{\delta}(\mathbf{k}_2) \rangle = (2\pi)^2\, \delta_D(\mathbf{k}_1 + \mathbf{k}_2)\, P_2(k_1) , \qquad (B-1)$$

where $\tilde{\delta}(\mathbf{k}) = \int \delta(\mathbf{x}) \exp(-i\mathbf{k} \cdot \mathbf{x})\, d^2x$ is the Fourier transform of the stochastic field, and
$\delta_D$ denotes the Dirac delta distribution. The autocorrelation function of $\delta$ is related to the
power spectrum through

$$w(r) = \int P_2(k) \exp(i\mathbf{k} \cdot \mathbf{r})\, \frac{d^2k}{(2\pi)^2} = \frac{1}{2\pi} \int_0^{\infty} k\, P_2(k)\, J_0(kr)\, dk , \qquad (B-2)$$

with $J_0(x)$ the Bessel function of the first kind of order zero. In words: the power spectrum and
the autocorrelation function are the Hankel transform (rotationally symmetric Fourier transform)
of each other. The inverse relation is

$$P_2(k) = 2\pi \int_0^{\infty} r\, w(r)\, J_0(kr)\, dr . \qquad (B-3)$$

In particular, for a scale-invariant power distribution $P_2(k) = A k^n$, with $-2 < n < -1/2$,
one gets

$$w(r) = A\, \frac{n\, 2^{n-1}\, \Gamma(n/2)}{\pi\, \Gamma(-n/2)}\, r^{-(n+2)} . \qquad (B-4)$$

It is often convenient to "observe" the fluctuation field with a finite resolution $R$:
$\delta(\mathbf{x}; R) = \int \delta(\mathbf{y})\, F(|\mathbf{y} - \mathbf{x}|; R)\, d^2y$. In this case, the observed power spectrum, $P_2(k)\, W^2(kR)$
(with $W(kR)$ the Fourier transform of the smoothing kernel $F$), will be severely damped for
$k > 1/R$ with respect to $P_2(k)$. Thus, the variance of the CPDF in cells of characteristic
(one-dimensional) size $R$, $\bar{w}(R)$, can be directly computed by integrating the power spectrum:

$$\bar{w}(R) = \lim_{r \to 0} \langle \delta(\mathbf{x}; R)\, \delta(\mathbf{x} + \mathbf{r}; R) \rangle = \frac{1}{2\pi} \int_0^{\infty} k\, P_2(k)\, W^2(kR)\, dk . \qquad (B-5)$$

The appropriate smoothing kernel for the circular "top-hat" cells discussed in the main text
is

$$W_{\rm TH}(kR) = \frac{1}{\pi R^2} \int_0^{R} dr \int_0^{2\pi} d\theta\; r \exp[-ikr\cos(\theta)] = \frac{2}{kR}\, J_1(kR) , \qquad (B-6)$$

with $J_1(x)$ the Bessel function of the first kind of order one. For $P_2(k) = A k^n$, with $-2 < n < 1$
(the integral in equation (B-5) diverges at large $k$ for $n \geq 1$), one eventually obtains

$$\bar{w}(R) = -\frac{A}{\pi^{3/2}}\, \frac{\Gamma[(1/2) - (n/2)]\, \Gamma(n/2)}{\Gamma(2 - n/2)\, \Gamma(-n/2)}\, R^{-(n+2)} = \frac{-1}{n\, \pi^{1/2}\, 2^{n-1}}\, \frac{\Gamma[(1/2) - (n/2)]}{\Gamma(2 - n/2)}\, w(R) . \qquad (B-7)$$

For isotropic random fields, the variance of the smoothed field, $\bar{w}(R)$, can be expressed
as a weighted average of the correlation function $w(r)$,

$$\bar{w}(R) = \int w(xR)\, P(x)\, dx , \qquad (B-8)$$

with $P(x)$ the probability density function of the normalized separations $x = r/R$ between
points lying within the smoothing surface. This function can be determined by substituting
equation (B-3) into equation (B-5). Exchanging the order of the two integrals, one obtains

$$P(x) = x \int_0^{\infty} y\, J_0(xy)\, W^2(y)\, dy . \qquad (B-9)$$

In particular, for the window function in equation (B-6),

$$P_{\rm TH}(x) = 4x \int_0^{\infty} \frac{1}{y}\, J_0(xy)\, J_1^2(y)\, dy . \qquad (B-10)$$

This function has a bell shape, and reaches its maximum value for $x \simeq 0.84$ (see Figure B1).
For $x \to 0$, $P_{\rm TH}(x)$ asymptotically matches the function $2x$, and, as expected, it vanishes for
$x > 2$.
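
The properties of $P_{\rm TH}(x)$ listed above can be checked numerically. The sketch below does not evaluate the Bessel integral in equation (B-10). Instead it assumes the standard closed form for the distribution of distances between two points drawn uniformly inside a disc of unit radius, which is not quoted in this paper but has the same limiting behaviour ($2x$ at small $x$, zero at $x = 2$). Function names are hypothetical.

```python
import math


def p_th(x):
    """PDF of the normalized separation x = r/R of two random points in a disc.

    Closed form for uniformly distributed points (assumed equivalent to B-10).
    """
    if x <= 0.0 or x >= 2.0:
        return 0.0
    half = 0.5 * x
    return (4.0 * x / math.pi) * (math.acos(half) - half * math.sqrt(1.0 - half * half))


def cell_average(beta, n=200_000):
    """Ratio wbar(R)/w(R) for w(r) proportional to r^-beta, via equation (B-8).

    Uses the midpoint rule on 0 < x < 2, which avoids the integrable
    singularity of x^-beta at the origin.
    """
    h = 2.0 / n
    return sum(((i + 0.5) * h) ** -beta * p_th((i + 0.5) * h) for i in range(n)) * h


# Location of the maximum on a grid of step 0.001
x_peak = max((i * 1e-3 for i in range(1, 2000)), key=p_th)
print(x_peak, cell_average(0.0), cell_average(0.5))
```

With this form the maximum falls near $x \simeq 0.84$ and the distribution integrates to unity. For $\beta = 0.5$ the cell-averaged correlation exceeds $w(R)$ by roughly 20%, in agreement with the prefactor of equation (B-7) evaluated at $n = -1.5$.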

Table 1. The Observed Fields

#    Field                   Size^a         N^b    Σ^c
1    0050+123 (CDF)          159.8          177    1.11
1a   0050+123 (CDFa)         8.8 × 8.9       80    1.02
1b   0050+123 (CDFb)         9.1 × 9.1       97    1.18
2    1234+625 (HDF)          8.6 × 8.7      106    1.41
3    1415+527 (Westphal)     15.0 × 15.1    287    1.27
4    2215+000 (SSA22)        153.4          190    1.24
4a   2215+000 (SSA22a)       8.6 × 8.9      116    1.49
4b   2215+000 (SSA22b)       8.6 × 9.0       75    0.97
5    2237+114 (DSF2237)      159.7          211    1.32
5a   2237+114 (DSF2237a)     9.1 × 9.2       89    1.07
5b   2237+114 (DSF2237b)     9.0 × 9.1      126    1.55

^a In units of arcmin^2.

^b Number of LBG candidates with $\mathcal{R} < 25.5$.

^c Surface density at $\mathcal{R} < 25.5$; galaxies per arcmin^2.

Table 2. Best-fitting parameters for $\bar{w}(\Theta)$. Power-law fit. No bias correction.

Data              Err.^a   N_fit^b   f_var^c   A_best^d   β_best   χ²_min/d.o.f.   A range (Δχ² = 1)   β range (Δχ² = 1)
40" < Θ < 100"    C         3        0.9991    0.57       0.49     0.10/1          [0.05, 2.31]        [0.00, 0.76]
40" < Θ < 100"    C         4        0.9998    0.56       0.49     0.11/2          [0.05, 2.18]        [0.00, 0.75]
40" < Θ < 100"    C         5        0.9999    0.51       0.47     0.11/3          [0.05, 2.01]        [0.03, 0.73]
40" < Θ < 100"    C         6        0.9999    0.59       0.50     0.28/4          [0.08, 2.09]        [0.10, 0.74]
40" < Θ < 100"    C         7        0.9999    0.64       0.52     0.79/5          [0.07, 2.25]        [0.08, 0.76]
40" < Θ < 100"    C         8        0.9999    0.61       0.51     1.01/6          [0.06, 2.23]        [0.04, 0.76]
40" < Θ < 100"    U        30        1.0000    0.65       0.51     0.02/30         [0.16, 2.04]        [0.20, 0.78]
8" < Θ < 100"     C         3        0.9926    0.16       0.21     2.22/1          [0.03, 0.54]        [-0.13, 0.43]
8" < Θ < 100"     C         4        0.9977    0.10       0.12     3.61/2          [0.01, 0.35]        [-0.23, 0.34]
8" < Θ < 100"     C         5        0.9988    0.09       0.11     4.89/3          [0.01, 0.33]        [-0.21, 0.34]
8" < Θ < 100"     C         6        0.9995    0.12       0.16     5.07/4          [0.03, 0.35]        [-0.12, 0.35]
8" < Θ < 100"     C         7        0.9997    0.12       0.16     5.23/5          [0.03, 0.31]        [-0.13, 0.33]
8" < Θ < 100"     C         8        0.9999    0.08       0.09     6.17/6          [0.02, 0.23]        [-0.20, 0.27]
8" < Θ < 100"     U        47        1.0000    0.24       0.28     2.42/45         [0.14, 0.42]        [0.15, 0.40]

^a C and U stand for correlated and uncorrelated errors, respectively.

^b Number of principal components used in the least-squares analysis.

^c Fraction of the total variance accounted for by the principal components considered.

^d In units of arcsec^β.

Table 3. Best-fitting parameters for $\bar{w}(\Theta)$. Power-law fit. Bias-corrected data.

Data              Err.   N_fit   f_var     A_best   β_best   χ²_min/d.o.f.   A range (Δχ² = 1)   β range (Δχ² = 1)
40" < Θ < 100"    C       3      0.9991    0.40     0.37     0.08/1          [0.05, 1.56]        [-0.05, 0.63]
40" < Θ < 100"    C       4      0.9998    0.45     0.39     0.14/2          [0.07, 1.59]        [0.01, 0.64]
40" < Θ < 100"    C       5      0.9999    0.34     0.33     0.39/3          [0.06, 1.31]        [-0.03, 0.59]
40" < Θ < 100"    C       6      0.9999    0.39     0.36     0.42/4          [0.07, 1.33]        [0.03, 0.60]
40" < Θ < 100"    C       7      0.9999    0.39     0.36     0.48/5          [0.07, 1.33]        [0.00, 0.60]
40" < Θ < 100"    C       8      0.9999    0.37     0.35     0.78/6          [0.06, 1.29]        [-0.02, 0.59]
40" < Θ < 100"    U      30      1.0000    0.42     0.37     0.02/33         [0.14, 1.31]        [0.15, 0.63]
8" < Θ < 100"     C       3      0.9926    0.18     0.19     1.78/1          [0.04, 0.55]        [-0.11, 0.39]
8" < Θ < 100"     C       4      0.9977    0.12     0.11     3.23/2          [0.02, 0.35]        [-0.26, 0.30]
8" < Θ < 100"     C       5      0.9988    0.12     0.12     4.90/3          [0.02, 0.34]        [-0.25, 0.31]
8" < Θ < 100"     C       6      0.9995    0.16     0.17     5.52/4          [0.06, 0.42]        [-0.03, 0.34]
8" < Θ < 100"     C       7      0.9997    0.15     0.16     6.08/5          [0.05, 0.34]        [-0.07, 0.30]
8" < Θ < 100"     C       8      0.9999    0.12     0.12     6.38/6          [0.04, 0.28]        [-0.10, 0.27]
8" < Θ < 100"     U      47      1.0000    0.24     0.24     2.05/45         [0.14, 0.40]        [0.12, 0.34]
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Fig. 1. — Average correlation function of LBGs (points) estimated computing the factorial 
moments of the counts, and relative uncertainty (errorbars). The long-dashed line shows the 
bias of the estimator used to determine w. 
Fig. 2. - Average correlation function of LBGs, estimated by computing the factorial moments
of the counts in 8 different fields. In the inset, the average of the correlations shown in the
main panel (squares) is compared with the results obtained in Section 3.1 (triangles). Open
and filled symbols refer to quantities computed excluding and including the Westphal field,
respectively. The dashed line shows the standard deviation among the functions shown in
the main panel. The continuous line marks the correlation function in the Westphal field.

Fig. 3. - Average correlation function of LBGs estimated by performing a maximum likelihood
analysis of the CPDF (points with error bars). The continuous line shows the best estimate
of w in Figure 1. Left and right panels have been obtained assuming that the CPDF
approximates a negative binomial and a Poisson-sampled lognormal distribution, respectively. For
each $\Theta$, to reduce the effects of statistical fluctuations, only the values of $N$ that have been
measured in at least 1000 different cells have been considered for the maximum likelihood
analysis. In the top panels, errors in the CPDF at different values of $N$ are assumed to be
statistically independent. In the bottom panels, principal component analysis is used to deal
with correlated errors. The number of principal components included in the analysis has
been fixed by requiring that they account for 95% of the total variance.

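The 95% variance criterion for choosing the number of principal components can be sketched as follows. This is only an illustration under stated assumptions: the covariance matrix is a made-up diagonal example, and the helper name `n_components_for_variance` is hypothetical.

```python
import numpy as np

def n_components_for_variance(cov, fraction=0.95):
    """Smallest number of principal components whose eigenvalues sum to
    at least `fraction` of the total variance (the trace of `cov`)."""
    eigvals = np.linalg.eigvalsh(np.asarray(cov, dtype=float))[::-1]  # descending
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return int(np.argmax(cumulative >= fraction)) + 1

# Hypothetical 4x4 covariance matrix of the CPDF errors (illustrative numbers).
cov = np.diag([8.0, 1.7, 0.2, 0.1])
n_keep = n_components_for_variance(cov)  # cumulative fractions: 0.80, 0.97, ...
```
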
Fig. 4. - Top: Average correlation function of LBGs estimated by performing a maximum
likelihood analysis of the counts-in-cells (filled points). Error bars (which are smaller than
the symbols used to mark the best-fitting values) denote only measurement errors, and should
be combined with cosmic errors to obtain the full uncertainties. The continuous line and
the shaded region show the best estimate for w in Figure 1 and its total uncertainty. Left
and right panels have been obtained assuming a negative binomial and a Poisson-sampled
lognormal CPDF, respectively. Bottom: Average correlation function of LBGs estimated
by performing a Kolmogorov-Smirnov test on the CPDF. The notation is as in the top panel.

Fig. 5. - Best-fitting models to the measured CPDF obtained using the Kolmogorov-Smirnov
test. Points denote the best estimates of the CPDF from the data. Bootstrap
error bars are also drawn. The dotted lines show the Poisson distributions with the same
average counts as the observed CPDF. The dashed and continuous lines represent, respectively,
the best-fitting negative binomial and lognormal models.

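To make the comparison between the Poisson and negative binomial models concrete, here is a minimal sketch. It assumes the common parametrization in which the negative binomial has mean $\bar{N}$ and variance $\bar{N} + \bar{w}\bar{N}^2$; the exact form used in the paper is not shown in this section, and the parameter values below are illustrative rather than fitted.

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of finding n galaxies in a cell."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))

def negative_binomial_pmf(n, mean, w_bar):
    """Negative binomial with the given mean and variance mean + w_bar * mean**2."""
    r = 1.0 / w_bar          # shape parameter
    p = r / (r + mean)       # "success" probability
    log_pmf = (math.lgamma(n + r) - math.lgamma(r) - math.lgamma(n + 1)
               + r * math.log(p) + n * math.log(1.0 - p))
    return math.exp(log_pmf)

mean, w_bar = 2.0, 0.5       # illustrative values, not fitted to the data
ns = range(200)
nb = [negative_binomial_pmf(n, mean, w_bar) for n in ns]
nb_mean = sum(n * p for n, p in zip(ns, nb))
nb_var = sum((n - nb_mean) ** 2 * p for n, p in zip(ns, nb))
```

In this parametrization the excess of the variance over the Poisson value is controlled entirely by `w_bar`, which is why a fit to the CPDF constrains the average correlation function.
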
Fig. 6. - Principal components of the correlation errors. The best estimate of the average
correlation function is marked by the points. The short-dashed line denotes the size of the
$1\sigma$ error bars in Figure 1. Note that the signal-to-noise ratio equals 1 for $\Theta \sim 15$ arcsec. The
long-dashed line shows the bias of the estimator used to determine w. The continuous lines
are obtained by displacing the data points proportionally to the first two principal components
of the errors. The top and bottom heavy lines correspond to a pure first principal component
error with amplitude $\pm 1/\sqrt{\lambda_1}$ (which, in a standard least-squares analysis, corresponds to
$\chi^2 = 1$). The intermediate light lines correspond to a pure second principal component
error with amplitude $\pm 1/\sqrt{\lambda_2}$ (again corresponding to $\chi^2 = 1$). The dash-dotted line is
obtained by combining the first and second principal components, each with amplitude
$1/\sqrt{\lambda_i}$ (i.e. it has $\chi^2 = 2$). Note how the shape of the correlation function is poorly
constrained by the data, especially on small scales.

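The statement that a displacement of amplitude $1/\sqrt{\lambda_i}$ along a principal component gives $\chi^2 = 1$ can be checked numerically. The sketch below assumes that the $\lambda_i$ are the eigenvalues of the inverse covariance matrix entering $\chi^2 = \delta^T C^{-1} \delta$ (the convention is not spelled out in this section), and it uses a made-up covariance matrix.

```python
import numpy as np

# Hypothetical 3x3 error covariance matrix (illustrative numbers only).
cov = np.array([[0.04, 0.03, 0.02],
                [0.03, 0.05, 0.03],
                [0.02, 0.03, 0.06]])
inv_cov = np.linalg.inv(cov)

# Assumption: lambda_i are eigenvalues of the inverse covariance matrix, so the
# smallest one belongs to the direction of largest error variance.
lam, vec = np.linalg.eigh(inv_cov)   # eigenvalues in ascending order

def chi2(delta):
    """Chi-square of a displacement `delta` from the best estimate."""
    return float(delta @ inv_cov @ delta)

d1 = vec[:, 0] / np.sqrt(lam[0])     # pure first principal component
d2 = vec[:, 1] / np.sqrt(lam[1])     # pure second principal component
chi2_first, chi2_second, chi2_both = chi2(d1), chi2(d2), chi2(d1 + d2)
```

Because the eigenvectors are orthogonal, the cross term vanishes and the combined displacement has $\chi^2 = 1 + 1 = 2$.
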
Fig. 7. - Contour levels of the $\chi^2$ function obtained by fitting a power-law model, $A\theta^{-\beta}$,
to the observed function $w(\Theta)$ without correcting the data for the bias of the correlation
estimator. The main body of the figure refers to the data at $\Theta > 40$ arcsec. The open circle
marks the position corresponding to the minimum value of the $\chi^2$ function. The continuous
lines show the levels $\Delta\chi^2 = 2.30$ and $\Delta\chi^2 = 6.17$, which, for Gaussian residuals, correspond
to the joint 68.3% and 95.4% confidence levels, respectively. The dashed line shows the
function in equation 13. Bottom inset: The contours of the main panel are compared with
those obtained from the data with $\Theta > 8$ arcsec. The levels $\Delta\chi^2 = 2.30$ and $\Delta\chi^2 = 6.17$ are
marked by the heavy lines, while the parameters for the best-fitting model are denoted by a
filled circle. Top inset: the measured correlation function (crosses) is compared with the
best-fitting models at $\Theta > 40$ arcsec with correlated errors (continuous line) and independent
errors (dotted line). The best-fitting model for $\Theta > 8$ arcsec is marked by a dashed line.

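The correspondence between the $\Delta\chi^2$ levels and the quoted confidence levels follows from the chi-square distribution, whose cumulative function for two degrees of freedom is $1 - e^{-\Delta\chi^2/2}$. A minimal sketch, assuming Gaussian residuals as in the caption (the function names are hypothetical):

```python
import math

def joint_confidence_two_params(delta_chi2):
    """Probability enclosed by a Delta chi^2 contour for two fitted parameters
    (chi-square distribution with 2 degrees of freedom)."""
    return 1.0 - math.exp(-delta_chi2 / 2.0)

def confidence_one_param(delta_chi2):
    """Same quantity for a single parameter (1 degree of freedom)."""
    return math.erf(math.sqrt(delta_chi2 / 2.0))

# Percent confidence for the two contour levels quoted in the caption.
levels = {dc: round(100 * joint_confidence_two_params(dc), 1) for dc in (2.30, 6.17)}
# Percent confidence for Delta chi^2 = 1 on a single parameter.
one_sigma = round(100 * confidence_one_param(1.0), 1)
```

The one-parameter case is the one relevant to ranges quoted at $\Delta\chi^2 = 1$, as in the last two columns of Table 3.
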






Fig. 8. - As in Figure 7 but for the bias-corrected data. The dashed line shows the function
in equation 14.

Fig. 9. - Average correlation function estimated by computing the factorial moments of the
counts in a set of mock galaxy catalogues with the same average density, size, and
large-scale correlation function as the observed sample. The continuous line shows the population
correlation function, $w(\theta) = 0.65/\theta^{0.50}$ (with $\theta$ in arcsec). The average of w over 1100
realizations of the point process is denoted by filled squares. The $1\sigma$ fluctuations of the
estimator w are marked by the shaded region. Filled circles represent the observed w in
Figure 1.

Fig. 10. - Probability distribution of the values assumed by the function w at $\Theta = 10$ arcsec
in the mock catalogues described in Section 5.1 (histogram). The corresponding population
value is marked by the short-dashed vertical line. The mean and standard deviation of the
distribution are shown with a continuous line and a horizontal error bar, respectively. The
long-dashed line denotes $w(\Theta = 10\ {\rm arcsec})$ in Figure 1.

Fig. 11. - Probability distribution of the ratio between the estimated and the population
values of w in the mock catalogues described in Section 5.1. Top and bottom panels refer,
respectively, to $\Theta = 50$ arcsec and $\Theta = 100$ arcsec. The average and the standard deviation
of the distribution are $0.80 \pm 0.36$ ($\Theta = 50$ arcsec) and $0.68 \pm 0.37$ ($\Theta = 100$ arcsec).

Fig. 12. - Probability distribution of the fluctuations of the estimator w computed with
respect to the average value and normalized to the standard deviation. The symbol $\gamma_1$
denotes the Fisher skewness of the distribution. The population value for w is marked by
a vertical line. Top and bottom panels refer, respectively, to $\Theta = 10$ arcsec and $\Theta = 100$
arcsec.

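For reference, the Fisher skewness is the third central moment divided by the cube of the standard deviation, $\gamma_1 = m_3 / m_2^{3/2}$. The sketch below uses the population (biased) form of the moments; whether the paper applies a small-sample correction is not stated here, and the sample values are illustrative.

```python
import numpy as np

def fisher_skewness(x):
    """gamma_1 = m_3 / m_2**1.5, with m_k the k-th central moment."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    m2 = np.mean(dev**2)
    m3 = np.mean(dev**3)
    return m3 / m2**1.5

def standardized(x):
    """Fluctuations about the mean in units of the standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

sample = [0.0, 0.0, 0.0, 4.0]            # illustrative, strongly right-skewed
gamma_1 = fisher_skewness(sample)
gamma_1_standardized = fisher_skewness(standardized(sample))
```

The skewness is unchanged by the standardization used for the plotted fluctuations, since it is invariant under shifts and positive rescalings.
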
Fig. 13. - Average correlation function $w(\Theta)$ estimated from pair counts. Filled dots, open
squares and stars refer to the Landy & Szalay, Hamilton, and Peebles estimators, respectively.
The dashed line marks the size of Poisson errors for the Landy & Szalay estimator. The
continuous line shows the function $\bar{w}(\Theta)$ corresponding to the power-law model $w(\theta) =
0.65/\theta^{0.50}$ (with $\theta$ in arcsec), which approximates the results from the CIC analysis. The
dotted line is obtained by multiplying the continuous curve by 0.8. The inset presents the
results of the Landy & Szalay estimator in logarithmic space. The notation is as in the main
panel.

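Two of the pair-count estimators named in this caption have widely used closed forms, sketched below for normalized data-data (DD), data-random (DR) and random-random (RR) pair counts in one separation bin. The helper names and the input numbers are hypothetical. The Peebles estimator is left out because this section does not say which form is meant.

```python
def normalized_pairs(dd_raw, dr_raw, rr_raw, n_data, n_random):
    """Divide raw pair counts by the total number of pairs of each kind."""
    dd = dd_raw / (n_data * (n_data - 1) / 2.0)
    dr = dr_raw / (n_data * n_random)
    rr = rr_raw / (n_random * (n_random - 1) / 2.0)
    return dd, dr, rr

def landy_szalay(dd, dr, rr):
    """Landy & Szalay estimator: (DD - 2 DR + RR) / RR."""
    return (dd - 2.0 * dr + rr) / rr

def hamilton(dd, dr, rr):
    """Hamilton estimator: DD * RR / DR^2 - 1."""
    return dd * rr / dr**2 - 1.0

# Illustrative normalized pair counts for a single bin (not survey data).
dd, dr, rr = 1.2, 1.05, 1.0
w_ls = landy_szalay(dd, dr, rr)
w_ham = hamilton(dd, dr, rr)
```
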
Fig. 14. - Effective bias of LBGs as a function of the radius of the cells within which the
counts are performed. Different panels refer to different cosmological models, as described in
Section 8.

Fig. 15. - Halo exclusion effects and galaxy correlation function. The continuous lines denote
the correlation function for mutually excluding dark matter halos in four different mass
ranges, as indicated by the labels. Filled points and the shaded region indicate the observed
correlation of LBGs and its uncertainty, as shown in Figure 1. The dashed line represents
the mass correlation function multiplied by the square of the effective bias parameter for
halos with M > 5 x 10^^ $M_\odot$ and with the same redshift distribution as the observed LBGs.

Fig. B1. - Distribution function of the normalized angular separations, $\theta/\Theta$, between points
lying within a circular cell of radius $\Theta$. The analytic expression of this function is given in
equation (B-10).



